I, Mr. Lawrence Tannenbaum, hereby declare and state that:

1. I am the inventor of the above patent application.

2. I am employed by The U.S. Army Center for Health Promotion and Preventive 
Medicine (USACHPPM) and have worked there for about 13 years. 

3. I am familiar with the rejection of the pending claims as set forth in the Office 
Action dated February 14, 2008. 

I. 1.131 Declaration: Antedating Cited References 

4. I set forth the following facts to antedate: 

(1) Ieradi et al., Folia Zoologica, Vol. 52, No. 1 (January 2003);

(2) Ryabokon et al., Radiat. Environ. Biophys. Vol. 44 (2005); and

(3) Phillips et al., Federal Facilities Environmental Journal, Vol. 13, Issue 1 (2002).

5. Ieradi et al. has an effective date of January 2003. Ryabokon et al. has an
effective date of 2005. As shown by the attached Exhibit A, Phillips et al. has an
effective date of April 24, 2002.

6. Attached hereto as Exhibit B is a copy of Tannenbaum et al., Rodent sperm 
analysis in field-based ecological risk assessment: pilot study at Ravenna army 
ammunition plant, Ravenna, Ohio, Environmental Pollution 123 (2003) 21-29. As 
shown on the front page, the article was received for publication on 17 April 2002.

7. Exhibit B confirms that the present invention was actually reduced to practice 
at least as early as April 17, 2002 and prior to the effective dates of the Ieradi et al.,
Phillips et al., and Ryabokon et al. references. Thus, Ieradi et al., Phillips et al., and
Ryabokon et al. are not prior art with respect to the present invention. 

8. All activities described in Exhibit B occurred in the United States of America. 

II. 1.132 Declaration: Enablement 
A. The Present Invention 

9. According to the present invention, if rodents at a contaminated site have 
impaired reproductive capability as determined by exceedances of sperm parameter 
benchmarks, then, by implication, other terrestrial mammals at the site have the potential to be
experiencing similar reduced reproductive success. Rodents are the "perfect real-world, 
worst-case receptors of exposure" because they burrow in the contaminated soil, eat 
contaminated vegetation, and drink contaminated water. They typically do not migrate 
and many generations of rodents live in contaminated areas year after year. Thus, it is 
possible to make a determination about the potential health effects or risk to mammals 
at a contaminated site based on any exceedances of the sperm parameter benchmarks.

10. The present invention is enabled because (1) the use of rodent models to
assess toxicological effects to other mammals is known; and (2) the use of animal 
sperm parameters in assessing human reproductive risks is known. 

B. The Use of Animal Models To Assess Toxicological Effects Is Routine 

11. Sample et al., Toxicological Benchmarks for Wildlife: 1996 Revision.
ES/ER/TM-86/R3. Oak Ridge National Laboratory, Oak Ridge, TN, USA is the
most-often applied source of such toxicological benchmarks for health effects in mammals. A
copy of this reference was filed with an Information Disclosure Statement on November 
6, 2007. 

12. In Sample et al., over 85% of the cited critical studies that are used in 
crafting toxicity reference values involve either mice or rats. For ecological risk 
assessments that evaluate non-human mammal species, rodents provide the 
toxicological basis for virtually all health determinations. 

13. For chemicals that regularly occur at contaminated terrestrial sites, it is 
important to note that in Sample et al., a reproduction endpoint forms the basis for more 
than 83% of the studies. 

14. Thus, in laboratory-based ecological risk assessment, a toxicological 
endpoint that occurs in the rodent is summarily assumed to also occur in the mammal 
species being evaluated. 

C. Use of Rodent Sperm Parameters to Estimate Human Effects 

15. Attached hereto as Exhibit C is a copy of U.S. Environmental Protection 
Agency, Guidelines for Reproductive Toxicity Risk Assessment, Office of Research and 
Development, Wash. D.C., USA (1996).

16. As stated in these EPA Guidelines:

(a) In the absence of adequate human data, our understanding of the
mechanisms controlling reproduction supports the use of data from experimental animal
studies to estimate the risk of reproductive effects in humans (Part A, Section 1, Page 1);

(b) An agent that produces an adverse reproductive effect in experimental animal
studies is assumed to pose a potential threat to humans. This assumption is based on
comparisons of data for agents that are known to cause human reproductive toxicity. In
general, the experimental animal data indicated adverse reproductive effects that are
also seen in humans (Part A, Section 1, Pages 2-3).

(c) Sperm morphology profiles are relatively stable and characteristic in a normal 
individual (and a strain within a species) over time. Sperm morphology is one of the 
least variable sperm measures in normal individuals, which may enhance its use in the 
detection of spermatotoxic events (Part A, Section 3.2.3.4.2, Pages 31-32). 

(d) The recent application of video and/or digital technology to sperm analysis
allows a more detailed evaluation of sperm motion including information about the
individual sperm tracks. It also provides permanent storage of the sperm tracks which
can be reanalyzed as necessary (manually or computer-assisted) (Part A, Section
3.2.3.4.3, Page 33).

(e) Human male fertility is generally lower than that of test species and may be
more susceptible to damage from toxic agents. Therefore, the conservative approach
should be taken that, within the limits indicated in the sections on those parameters,
statistically significant changes in measures of sperm count, morphology, or motility as
well as number of normal sperm should be considered adverse effects (Part A, Section
3.2.3.4.4, Page 34 - Adverse effects).

17. Accordingly, the use of animal (rodent) sperm parameters in conservatively 
assessing human reproductive risks is known.

D. The Working Article

18. I have reviewed Working, Male Reproductive Toxicology: Comparison of the 
Human to Animal Models, Environmental Health Perspectives, Vol. 77, 37-44 (1988). 
The Working article supports the enablement of the present invention for the following 
reasons. 

19. Working states that "[c]omparison of the reduction in sperm count in
humans and animals models after exposure to drugs or chemicals can be useful, 
provided that appropriate time points for the sperm count are selected." See Abstract 
and page 42. 

20. Using videomicrographic methods, Working states that sperm motility is a 
better indicator of toxic effect than sperm count. Similarly, Working discloses that 
methods for objective measurement of animal sperm morphology using 
videomicrographic methods permit a "more direct comparison of exposure-related 
changes in sperm head morphology in test species and the human male." See page 
43, right hand column. 

21. The present invention measures sperm count and sperm motility using a
Hamilton Thorne Integrated Visual Optics System (IVOS®) Sperm Analyzer. See
paragraphs [0045]-[0046] of the published application. Thus, the present invention
utilizes the accurate computer-assisted integrated visual optics type of system explicitly
indicated in Working as providing a more objective basis in assessing human risk from
animal models. See attached Exhibit D, which is a description of this device from the
Hamilton Thorne Biosciences website.

22. In view of Working, Sample et al., and the EPA guidelines, one of ordinary 
skill in the art would appreciate that rodent models directed to sperm parameters may 
be used to assess human ecological or health risks.

III. VERIFICATION

23. I hereby declare that all statements made herein of my own knowledge are 
true and that all statements made on information and belief are believed to be true; and 
further that these statements were made with the knowledge that willful false statements 
and the like so made are punishable by fine or imprisonment, or both, under Section 1001 of
Title 18 of the United States Code, and that such willful false statements may jeopardize 
the validity of the application or any patent issuing thereon. 





Mr. Lawrence Tannenbaum

Assessment of Potential Environmental Health Risks of Residue of High-Explosive
Munitions on Military Test Ranges - Comparison in a Humid and Arid Climate 

Loren Phillips (1), Bernard Perry (2)

(1) U.S. Army Center for Health Promotion and Preventive Medicine (CHPPM), Aberdeen
Proving Ground (APG), Maryland

(2) Environmental Quality Division, U.S. Army Developmental Test Command (DTC), APG,
Maryland

Abstract 

The U.S. Army Developmental Test Command (DTC) manages active ranges in a wide 
variety of environmental settings where personnel fire munitions of all calibers. Over time, 
physical, chemical, and biological processes may distribute munition fragments and residue 
on the range. Estimating the human health and ecological risks of these residues is a 
challenge because little information exists for evaluating the health effects of military-unique 
releases. A human health and ecological risk assessment at two DTC munition ranges (one in
a humid, temperate climate (Aberdeen Proving Ground, Maryland), and one in a hot, arid
climate (Yuma Proving Ground, Arizona)) has been completed. The objective was to estimate
health risks associated with exposure to munition residues. The scope included identifying the 
nature and extent of munition residue at firing points and in impact areas from firing 
munitions, analyzing exposure pathways, estimating the dose to living organisms and the 
response to the dose, and estimating risk from exposure. Where appropriate, water, soil, 
sediment, air, and biota samples were collected from the range area and reference areas. 
Samples were tested for explosives and metals. Risk to humans and ecosystem species was 
modeled. Results show that munition residue is not getting into the food chain or being 
transmitted by direct or indirect exposure in either climate despite low-level detections in 
some media. © 2002 Wiley Periodicals, Inc.

Abstract

Ecological risk assessment (ERA) guidance recommends that field-truthing efforts proceed when modeled hazard quotients 
(HQs) suggest that toxicological effects are occurring to site receptors. To date, no field methods have been proposed by the reg- 
ulatory community that can lead to definitive determinations of acceptable or unacceptable risk for birds and mammals, the two 
terrestrial classes of receptors that are commonly assessed using the HQ method. This paper describes rodent sperm analysis (RSA) 
as a viable method to be applied in the field at sites with historical contamination. RSA is capable of detecting biological differences 
that bear on reproduction, a highly regarded toxicological endpoint of concern in USEPA Superfund-type ERAs. The results of 
RSA's first application at a study site are reported and discussed. The paper also provides the rationale for RSA's efficacy in the 
context of Superfund and other environmental cleanup programs, where limited time and money are available to determine and 
evaluate the field condition. 
Published by Elsevier Science Ltd. 

Keywords: Ecological risk assessment; Field-truthing; Sperm; Rodents; Reproduction 



1. Introduction 

The current state of practice in screening-level and 
baseline ecological risk assessments (ERA) is the com- 
putation of hazard quotients (HQs) for a series of 
receptors (e.g., plants, invertebrates, birds, and mam- 
mals) that are representative of the site of interest. HQs 
(ratios of an animal's estimated daily dietary dose of a 
chemical to a reputedly safe dose of the same chemical) 
have notable limitations, and can only serve as mere risk 
screening tools (USEPA, 1989; Bartell, 1996). By them- 
selves, HQs do not demonstrate that receptors are 
actually at risk, and consequently HQs alone cannot 
justify a remedial action (e.g. excavation of soils) to 



* The opinions or assertions contained herein are the views of the 
authors and are not to be construed as official or as reflecting the views 
of the Department of the Army or the Department of Defense. 

* Corresponding author. Tel.: +1-410-436-5210; fax: 1-410-436-8170.

E-mail address: lawrence.tannenbaum@amedd.army.mil 
(L.V. Tannenbaum). 



ensure receptor protection, although some regulators 
may elect to use the HQ ratio this way. Recent EPA 
guidance (USEPA, 1997, 1998) acknowledges the 
imprecision in ERA modeling, and recommends that 
modeled results (that anticipate toxicological effects 
occurring in the wild on the basis of threshold HQ 
exceedance) be compared to field data as a check on 
whether the understanding of site conditions was cor- 
rect. To date, efforts to do so have been limited to the 
comparison of modeled and measured tissue con- 
centrations in plants and mice (Alsop et al., 1996), in 
order to address model over and underprediction. No 
formal field-truthing methodologies for the verification 
of modeled toxicological effects in terrestrial systems 
have been proposed or endorsed by the regulatory 
community. It would be prudent though, to develop 
such methods given the uncertainty in current modeling 
efforts, and given the frequency with which HQs are 
found to exceed the effects threshold of 1.0 (Duke and 
Taggart, 2000). This paper examines a field-truthing 
method, rodent sperm analysis (RSA), that documents 
the consequences of chemical exposure on reproductive
endpoints. The results of the method's test case at
contaminated sites at Ravenna Army Ammunition 
Plant (RVAAP) in rural northeastern Ohio, suggest that 
RSA has the potential to address HQ concerns at other 
sites. 
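
The hazard quotient screening step described above can be sketched as follows. This is only a minimal illustration: the function names and the dose values are hypothetical and are not taken from the study; only the ratio itself and the effects threshold of 1.0 come from the text.

```python
def hazard_quotient(estimated_daily_dose: float, safe_dose: float) -> float:
    """Ratio of an animal's estimated daily dietary dose of a chemical
    to a reputedly safe dose of the same chemical (same units)."""
    if safe_dose <= 0:
        raise ValueError("safe_dose must be positive")
    return estimated_daily_dose / safe_dose


def exceeds_effects_threshold(hq: float, threshold: float = 1.0) -> bool:
    """True when the HQ is above the effects threshold of 1.0."""
    return hq > threshold


# Hypothetical doses (mg/kg body weight/day), for illustration only.
hq = hazard_quotient(estimated_daily_dose=5.0, safe_dose=2.0)
print(hq, exceeds_effects_threshold(hq))
```

As the text notes, an exceedance flagged this way is a screening signal only and does not by itself show that receptors are at risk.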

The RSA method is predicated on three principal 
assumptions. First, if after several decades of exposure 
to contaminated site media, animals do not display 
physiological evidence of reproductive impact, it is rea- 
sonable to anticipate that future reproductive impacts 
will not arise. In most cases, contamination at Super- 
fund and similar sites is historical in nature (i.e., decades 
old). The short life spans of birds and mammals have 
allowed for many or multiple generations (possibly 
numbering in the hundreds) of exposure to con- 
taminated media. Second, where an observed reduction 
in the quality of conventional sperm parameters (e.g. 
sperm count, sperm motility, sperm morphology) is 
noted at the site of interest, and the site and the mat- 
ched reference location differ only in their soil chemis- 
tries, the observed sperm effects are due to the chemical- 
in-soil influences. These conventional sperm parameters 
when impaired were shown by Chapin et al. (1997) to 
correlate with reduced reproductive success. Third, 
when small rodents at a contaminated site are deemed 
to have impaired reproductive capability on the basis of 
lesser quality sperm parameters, by implication, other 
site terrestrial receptors have the potential to be experi- 
encing similar reduced reproductive success. This is a 
conservative assumption, because the degree of direct 
soil contact in wider-ranging mammals and birds (e.g., 
deer and hawks) is substantially less than that of small- 
ranging small rodents, and because biomagnifying 
compounds are not fully addressed by the method. Such 
assumptions must be made, however, if an ERA is to be 
brought to closure in a reasonable time frame. The 
method acknowledges that some chemicals may not
interfere with reproduction; rather, they may affect
other endpoints. Additionally, it is not known whether
decades-old, weathered chemicals in soil
(especially organic chemicals) have an ongoing
capacity to interrupt reproduction.

The RSA method as applied in the RVAAP pilot 
study was intended to detect if any reproductive effects 
might be occurring in small mammals and other organ- 
isms with similar exposures. These receptors were the 
ones evaluated in the Phase II ERA (SAIC, 1999) for 
RVAAP's Winklepeck Burning Grounds (WBG). In 
conjunction with a weight of evidence approach 
(USEPA, 1999; e.g. small mammal species composition 
measures were also considered), the RSA method fos- 
tered comparisons of reproductive measures between 
contaminated study sites and clean reference sites. The 
screening-level assessment resulted in HQs that were 
frequently in the high multiple 100s and occasionally 
exceeding 1000 (SAIC, 1999). The HQs reflect chemical 



concentrations in soil resulting from historical site 
operations dating back to 1941. Site use at the 200-acre 
WBG included open burning (i.e. on slag-covered bare 
ground) for melting explosives out of heavy artillery pro- 
jectiles, and waste disposal of explosives (i.e. RDX, TNT, 
Composition B, black powder, propellant), antimony sul- 
fide, lead oxide, lead thiocyanate, sludge, sawdust from 
the installation's load lines, and domestic wastes. The 
open burning and detonation activities resulted in residual 
chemicals and metallic munitions fragments remaining on 
as many as 70 burning "pads" ranging in size from 50
feet × 70 feet to 75 feet × 110 feet (Fig. 1).



2. Materials and methods 

2.1. Study sites 

Three WBG study sites, posing the greatest potential 
chemical stress to site receptors on the basis of HQ 
magnitude alone were selected (SAIC, 1999, 2000). The 
sites were geographically distinct from each other, such 
that small rodent home ranges at each did not overlap. 
Each study site consisted of two adjacent burning pads 
(pad pair numbers 37 and 38, 58 and 59, 66 and 67; 
Fig. 1) that for multiple receptors, shared high HQs for 
either metals, explosives, or a mixture of these chemical 
groups (Table 1). Corresponding reference sites (on 
RVAAP but more than one mile beyond the WBG 
boundary) for each two-pad grouping were selected 
based on similar soil, site history and other character- 
istics, during a preliminary field reconnaissance effort in 
the spring of 2000 (SAIC, 2000). Specific criteria for 
selection included hydrology, soil type, topography, 
site-use history, degree of maintenance (i.e. mowing), 
and plant community type. Care was taken to ensure 
that the reference sites offered the same level of resour- 
ces as the study sites with regard to their ability to 
attract and support White-footed mice (Peromyscus
leucopus) and Meadow voles (Microtus pennsylvanicus)
and other animal life. An earlier small mammal survey 
(Carroll, 1999) indicated these were the two most 
numerous small rodent species at RVAAP, although the 
survey had not been extended to WBG. 

2.2. Animal trapping 

Four consecutive trap nights at each of three sites (at 
a time) were envisioned, with the reasonable expectation 
that this would result in the minimum number of target 
animals (27 each of White-footed mice and Meadow 
voles for each site) being trapped to support a rigorous 
statistical comparison. The figure of 27 adult males corresponds
to an alpha of 5%, a statistical power of 95%, and a 1:1 
ratio of significant difference (Sign. Diff.) to coefficient 
of variation (CV). Employing this ratio, whereby the

Fig. 1. Schematic of the Winklepeck Burning Grounds site at Ravenna Army Ammunition Plant (RVAAP). Some 70 burning pads, each measuring
roughly 50 feet × 70 feet to 75 feet × 110 feet, are aligned in rows. The magnitude of the higher hazard quotients for the pads is depicted with a pattern
(see legend). The three two-pad complexes that constituted the field study sites are demarcated (rectangles). Reference sites lie beyond the WBG 
boundary and are not shown here. 



Table 1

Highest receptor-specific hazard quotients (a) by receptor at the burning pad sites (b, d)

Receptor column headings: Hawk, Owl, Shrew, Fox, Cottontail rabbit
[Cell alignment not recoverable; entries are listed in reading order as hazard quotient and chemical.]

Pad pair 37 & 38
  2650 cadmium; 13 zinc; 71 cadmium; 31 thallium; 4 thallium
  1890 lead; 7 lead; 6 lead; 64 thallium; 2 zinc
  45 zinc; 4 DBP (c); 3 DBP (c); 92 lead

Pad pair 58 & 59
  1300 lead; 60 zinc; 55 zinc; 212 cadmium; 7 thallium; 12 antimony
  242 cadmium; 3 mercury; 4 lead; 19 antimony; 7 zinc
  5 lead; 2 mercury; 12 thallium; 3 cadmium; 4 cadmium

Pad pair 66 & 67
  1280 lead; 60 zinc; 7 zinc; 706 TNT; 361 TNT; 709 TNT
  5 lead; 4 lead; 207 HMX; 229 RDX; 208 HMX
  16 barium; 446 RDX; 106 HMX; 448 RDX


a The three highest chemical-specific hazard quotients are shown. 

b Corresponding toxicological effects for all hazard quotients, except those for RDX, are reproductive. The study endpoint for RDX is increased 
organ weight. 
c Di-n-butylphthalate. 

d Aluminum at numerous pads and for almost all receptors had hazard quotients in the multiple hundreds. Aluminum was determined to not be a 
contaminant of concern.

Sign. Diff. is selected relative to the CV rather than
independent of it, allowed the sample size to be deter- 
mined without knowing the measured CV for the sperm 
parameters from the field. Statistically for a normally 
distributed sperm parameter, the selection of 20% (of 
the parameter mean) as the Sign. Diff., means that more 
than 99% of the results would fall within 2.5 standard 
deviations of the mean, and that one standard deviation 
would represent about 20% of the parameter's range. 
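
The target of 27 animals per group is consistent with a standard normal-approximation sample-size formula for comparing two means, sketched below. The paper does not state which formula was used, so the two-sided test, the small-sample correction term, and the function name are all assumptions made for illustration.

```python
import math
from statistics import NormalDist


def animals_per_group(alpha: float, power: float, diff_over_cv: float,
                      small_sample_correction: bool = True) -> int:
    """Approximate per-group sample size for a two-group comparison of means.

    diff_over_cv is the ratio of the significant difference to the
    coefficient of variation (1.0 for the 1:1 ratio in the text).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # assumption: two-sided test
    z_beta = z.inv_cdf(power)
    n = 2 * (z_alpha + z_beta) ** 2 / diff_over_cv ** 2
    if small_sample_correction:
        # assumption: additive correction for using t rather than normal quantiles
        n += z_alpha ** 2 / 4
    return math.ceil(n)


print(animals_per_group(alpha=0.05, power=0.95, diff_over_cv=1.0))
print(animals_per_group(alpha=0.05, power=0.95, diff_over_cv=1.0,
                        small_sample_correction=False))
```

Under these assumptions the sketch returns 27 with the correction and 26 without it. Because the ratio of Sign. Diff. to CV is fixed in advance, no field estimate of the CV is needed to run the calculation.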

Animal trapping began in mid-spring 2000 after the 
season's first litters had matured to adulthood (verifi- 
able by pelage). Sherman live traps, baited with a rolled 
oats and peanut butter mix and a sweet feed mix for 
horses, were randomly placed in preferable habitat 
within a 50-m radius from the center of each of the pads 
(the three pad pairs). Seventy-five traps were set out on 
each pad (150 per site), effectively saturating the trap- 
ping area. During the field effort, heavy rains flooded 
the soils of both the burning pad and reference sites, 
causing rodents to migrate to higher ground and neces- 
sitating a delay of a second trapping event. All non-tar- 
get animals trapped were identified to species, weighed 
using a Pesola scale, and released. All female juvenile 
and sub-adult White-footed mice and Meadow voles 
and juvenile males were weighed in the field and also 
marked on the top of the head with a small dab of 
colored nail polish prior to release, so that animals 
trapped on subsequent nights would not be double- 
counted. Females were additionally assessed for repro- 
ductive state (lactating or pregnant). 

2.3. Sperm measurement 

Target animals (i.e. adult male White-footed mice and 
Meadow voles) were transported in their traps on the 
day following capture to the field laboratory. Target 
animals were euthanized with carbon dioxide, and liver 
weights were first recorded. For the assessment of sperm 
motility, the right vas deferens was surgically removed 
with care to minimize blood contamination. The excised 
tissue was immediately placed into a pre-warmed sus- 
pension medium containing 3 ml of phosphate buffered 
saline with 1% bovine serum albumin, and given a 
3-min "swim-out" period to allow sperm to enter the 
medium. A 100 µm cannula was then inserted into the
medium to obtain a sample, and the cannula inserted
into the retractable stage of a Hamilton-Thorne integrated
visual optics system (IVOS) sperm analyzer, for a
general examination of sperm on the analyzer's main 
unit color monitor. The analyzer was preset to auto- 
matically move the stage to five different fields along the 
length of the cannula and to store each motion image 
(uniquely identified by study number, animal number, 
and cannula field number) on a Hewlett-Packard write-once
optical disk, creating a permanent record for precise
image reproduction and retrieval. Several weeks



later, each image was recalled from the optical disk and 
analyzed for motile and non-motile cells. A percent
motility for all five recorded fields was determined for each
animal, and other motility parameters (including 
straight-line, curvilinear, and path velocities; and pro- 
gressive motility and cross-beat frequency) were also 
calculated. 

The left epididymis was also removed following ani- 
mal euthanization, and frozen on dry ice. It was used to 
determine total sperm count and sperm abnormality 
(the percentage of misshapen sperm). The epididymis 
was thawed and the caudal section removed and 
weighed in order to report the total count as millions of 
sperm/gram of caudal epididymal tissue. It was then 
homogenized and a 100 µl sample added to a vial containing
a fluorescent dye (Hoechst dye H33342) to stain
the DNA in the sperm head, in order to prevent surrounding
debris from being counted as sperm. A 9 µl
sample was added to a slide which was cover-slipped, 
secured to the retractable stage, and then loaded into 
the IVOS. The analyzer automatically counted the 
stained sperm heads for 20 fields per slide, minimizing 
the sperm cell distribution variance within single 
samples. 

For sperm morphology, two slides were prepared 
from the epididymal sample prior to homogenization, 
and later stained with 5% eosin and cover-slipped for 
microscopic evaluation. Two hundred sperm cells were
evaluated with reverse phase/dark field microscopy
(×40 objective) for head and tail abnormalities (size,
shape, and double heads/tails), with the results reported 
as the percentage of the 200 sperm that were abnormal. 
The procedures followed for the evaluation of sperm 
count, sperm motility, and sperm morphology are those 
of Pathology Associates, A Charles River Company
(2002).



3. Results 

3.1. Primary assessment metrics — sperm parameters 

Due to the weather and reduced trapping success, 
there were not enough adult male White-footed mice to 
perform pair-wise statistical comparisons between 
burning pad pairs and corresponding reference sites as 
planned. The burning pads and the reference sites were 
each pooled to preserve statistical integrity (described 
later). It was only possible to statistically compare the 
sperm parameters of mice because of the low vole cap- 
tures. Consistent with previous reports in the literature 
(Zenick et al., 1994), the sperm parameter with the 
greatest variability, as illustrated by the CV, was sperm 
count (Tables 2 and 3). Table 3 reports the results of the 
Wilcoxon rank sum test comparing medians of sperm 
parameters, and body and normalized liver weights of 



Table 2

Sperm parameters comparison in field-collected White-footed mice

                                     Pooled reference areas                      Pooled burning pads
Sperm parameter                      n    Range          Mean   S.D.   CV        n      Range          Mean   S.D.   CV
Sperm count (10^6 sperm/g tissue)    8    1178.8-2241.9  1670   353.8  21.2      6      1229.5-1901.7  1409   309    21.9
Sperm motility (percent)             8    94-99          98.4   1.77   1.8       5 (1)  98-100         99.2   0.84   0.8
Sperm abnormality (percent)          8    NA             0.0    0      NA        6      0-1            0.0    0      NA

(1) = one sperm sample was lost, and did not allow for the motility measure; NA = not applicable; n = sample size; S.D. = standard deviation; and
CV = coefficient of variation.



Table 3

Statistical analysis of sperm parameter comparison

Sperm parameter      Mean of           Mean of         Percent          Percent              Agreement with expected       Probability that observed
                     reference areas   burning pads    difference (a)   difference/CV (b)    direction of difference (c)   difference resulted from chance (d)
Sperm count          1670              1409            -16.71           -0.78                Yes                           0.114
Sperm motility       98.4              99.2            0.84             0.55                 No                            0.093
Sperm abnormality    0.0               0.0             NA               NA                   No                            NA

a Calculated as mean of pooled burning pads minus mean of pooled reference areas/overall mean, multiplied by 100.
b Calculated as the pooled standard deviation/overall mean.

c The expectation is that contaminants at the burning pads will reduce sperm count, reduce sperm motility, and increase the percentage of
misshapen sperm.

d Probability calculated using one-sided Wilcoxon exact test assuming a power of 95%.



the mice. There was no statistically significant differ- 
ence between burning pads and reference sites for 
either sperm parameter (sperm count and sperm moti- 
lity; sperm abnormality was 0% for both groups), 
although the lesser sperm count of the burning pad 
mice (a reduction of 16.7%) was in the direction 
expected (i.e. exposures to contaminants are known to 
reduce sperm production and will be explained in Sec- 
tion 4.3). However, variability of burning pad mice was 
less than the reference site mice, as demonstrated by a 
smaller range of values for two of the parameters (count 
and motility). The only attribute with a statistically 
significant difference was liver weight, with livers of 
mice from the burning pads 17.9% heavier than those 
of reference site mice. However, this difference dis- 
appears when the liver weight is normalized by body 
weight. 
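
The derived columns of Table 3 can be approximated from the summary statistics in Table 2. The sketch below assumes that the "overall mean" is the sample-size-weighted mean of the two groups and that the pooled standard deviation uses the usual degrees-of-freedom weighting; neither detail is stated in the paper, and the function name is hypothetical.

```python
import math


def percent_difference_and_ratio(n_ref, mean_ref, sd_ref, n_pad, mean_pad, sd_pad):
    """Return (percent difference, percent difference / CV) for two groups.

    Percent difference follows Table 3 footnote a; CV follows footnote b,
    expressed in percent so the ratio is dimensionless.
    """
    # assumption: overall mean weighted by sample size
    overall_mean = (n_ref * mean_ref + n_pad * mean_pad) / (n_ref + n_pad)
    pct_diff = (mean_pad - mean_ref) / overall_mean * 100
    # assumption: conventional pooled variance with n - 1 weights
    pooled_var = ((n_ref - 1) * sd_ref ** 2 + (n_pad - 1) * sd_pad ** 2) / (n_ref + n_pad - 2)
    cv_pct = math.sqrt(pooled_var) / overall_mean * 100
    return pct_diff, pct_diff / cv_pct


# Sperm count summary statistics as reported in Table 2.
pct, ratio = percent_difference_and_ratio(8, 1670, 353.8, 6, 1409, 309)
print(round(pct, 2), round(ratio, 2))
```

With the rounded sperm count values, this gives roughly -16.8% and -0.78, close to the tabulated -16.71 and -0.78; the small gap presumably reflects rounding in the published means.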

The data were further evaluated to determine the
minimum detectable difference between groups using 
the Wilcoxon Rank Sum Test. Table 4 lists the biologi- 
cal attributes in order of decreasing power to detect a 
20% difference between group means at an alpha of 
5%. Power ranged from 100 to 91% for sperm mor- 
phology, motility, body weight, and liver weight, with 
the power decreasing as the CV increased, respectively. 
It was, therefore, possible to detect a minimum differ- 
ence of 20% between groups for these four measures. A 
detectable difference of 20% was not possible for sperm 
count, which had a 48% power. However, the power to 
detect a minimum difference of 30% at an alpha of 5% 
for sperm count was 77%. 



3.2. Small mammal species composition and 
reproductive status 

A total of 152 small mammals representing 10 species 
were captured in the study during the two trapping ses- 
sions of 8 and 6 days, respectively (Table 5). The 
majority of these animals (58%) were target species. 
White-footed mice were present in nearly equal num- 
bers at the burning pads and the reference sites, 
while Meadow voles were five times more numerous 
at the burning pad sites. The reference sites had four 
non-target species present that were absent from the 
burning pads, and the burning pad sites had two 
non-target species present that were absent from the 
reference sites. For four of these non-target species, only 
one or two individuals constituted the captures. The 
greatest disparity in species composition between the
burning pads and the reference sites was the capture,
only at the reference sites, of Short-tailed shrews
(Blarina brevicauda) and Eastern chipmunks (Tamias
striatus).

Table 6 provides the age structure and sex ratios for 
the target species. No apparent differences in age struc- 
ture and sex ratios were evident for White-footed mice 
between the burning pads and reference sites. Meadow 
vole age and sex ratios were similar to those of White-footed
mice at the burning pad sites; however, the low
number of captures at the reference sites did not allow a 
similar comparison for Meadow voles. Of the 12 female 
adult and sub-adult White-footed mice trapped at the 
reference sites, two (17%) were pregnant, and four 



difference between biological attributes

Biological attribute | CV (%) | Power at 5% α level: for sign. diff. equal to CV | for 20% sign. diff. | for 30% sign. diff.

Count (10^6 sperm/g tissue)     1558
Motility (%)                    98.7
Body weight (grams)             21.73
Normalized liver weight^a       0.053

a Normalized liver weight as liver weight (g)/body weight (g).



Table 5

Species                         Burning pads   Reference sites   Total number trapped
White-footed mouse^a            29             33                62
Meadow vole^a                   22             4                 26
Eastern cottontail rabbit       2              -                 2
Deer mouse                      1              -                 1
Masked shrew                    1              -                 1
Short-tailed shrew              -              17                17
Eastern chipmunk                -              36                36
Meadow jumping mouse            1              1                 2
Southern flying squirrel        -              2                 2
Woodland vole                   -              1                 1
Total number animals trapped    55             96                152

a Excludes recaptures.






Table 6
Age structure and sex ratios of key species^a

                       Adults             Sub-adults         Juveniles
Species                Males   Females    Males   Females    Males   Females
White-footed mice
  Reference sites      8       9          5       3          3       5
  Burning pads         7       5          3       3          7       4
Meadow voles
  Reference sites
  Burning pads

a Figures are numbers of animals trapped.



(33%) were lactating. Of the eight female adults and
sub-adults trapped at the burning pads, one (13%) was
pregnant, and three (38%) were lactating. Thus, female
reproductive status was similar at the burning pads and
the reference sites.
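
The percentages quoted for female reproductive status follow directly from the counts given in the text. The short check below reproduces them; the helper name `percent_half_up` and the dictionary layout are assumptions introduced for illustration, and halves are rounded up to match the reported figures.

```python
def percent_half_up(count, total):
    """Whole-number percentage of count/total, with halves rounded up."""
    return (200 * count + total) // (2 * total)


# Counts of adult and sub-adult White-footed mouse females, as stated in the text.
females = {
    "reference sites": {"trapped": 12, "pregnant": 2, "lactating": 4},
    "burning pads": {"trapped": 8, "pregnant": 1, "lactating": 3},
}

summary = {
    site: {
        status: percent_half_up(counts[status], counts["trapped"])
        for status in ("pregnant", "lactating")
    }
    for site, counts in females.items()
}

for site, values in summary.items():
    print(site, values)
```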



In theory, when HQs above 1.0 are computed and
cannot be rationalized through uncertainties or the
conservative nature of the method used, a more focused,
field-based effort to establish whether signs of stress or
impact are evident in site receptors should proceed.

4.1. Rationale for using small mammals

Given the impracticability (e.g. logistics, cost, potential
to decimate a population) of working with large,
wide-ranging and higher trophic level terrestrial species,
it is only feasible to collect and utilize small mammals.
Small rodents are advantageous for study because they
are generally plentiful in most habitats and relatively easy
to capture, and because trapping, handling, and euthanizing
methods are commonly approved by institutional animal
care and use committees. Within an ERA context,
they are additionally advantageous for use because of
their small or limited home ranges, which can virtually
guarantee that trapped specimens are coming in contact
with sites of interest. Having the most direct contact
with soil compared to other mammals and birds, and
life spans that rarely exceed one year, small rodents are
the species of choice. They are maximally exposed to
site contaminants and have had multiple opportunities
(i.e. successive generations) to display critical defects.

Certain small mammals, however, are not as appropriate
for this type of study, especially where a minimum
number of specimens is needed to ensure adequate
statistical power for data analysis. For instance, shrews
are not as suitable because of their exceedingly high
metabolism that necessitates continuous feeding; these
species cannot be expected to survive more than a few
hours after trapping unless special precautions are taken
(Butterfield et al., 1981; Shore et al., 1995; Little and
Gurnell, 1989; Kutzageorgis and Mason, 1997). In
summary, most small mammal species are ideal target



4.2. Justification of the use of RSA in Superfund-type
ERAs


In the context of sites managed under the USEPA 
Superfund program and similarly structured environmental
programs where time and money are limiting
resources, the ultimate concern for any species within an 
ERA context is that it be able to survive and reproduce. 
This is reflected in the wording of assessment endpoints 
and the selection of toxicity reference values (TRVs). 
When there are several TRVs available for a given che- 
mical of concern, each with a different corresponding 
toxicological effect (e.g. excess liver weight, reproductive 
effect, excess salivation), the risk assessor will typically 
use the TRV based on reproductive endpoints. Evalua- 
tion of reproductive endpoints is also common to most 
routine field-based toxicity tests, regardless of the medium
(e.g. soil, surface water) they address. However, the
well-being of terrestrial birds and mammals cannot be
reasonably addressed by standardized soil toxicity tests
(e.g. lettuce seed germination or earthworm survival).
Other types of field tests are needed for these receptors.
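
The TRV selection described above, and the hazard quotient (HQ) it feeds, can be sketched as follows. The HQ is taken here in its conventional form, the estimated exposure dose divided by the TRV. All names, TRV values, and the dose are hypothetical placeholders rather than values from this study, and choosing the lowest reproductive TRV when several exist is an assumption of the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TRV:
    endpoint: str            # toxicological effect the value is based on
    value_mg_kg_day: float   # hypothetical units and magnitude
    reproductive: bool       # True if based on a reproductive endpoint


def select_trv(trvs):
    """Prefer TRVs based on reproductive endpoints, else consider all.

    Among the candidates, the lowest value is used (sketch assumption).
    """
    repro = [t for t in trvs if t.reproductive]
    candidates = repro if repro else list(trvs)
    return min(candidates, key=lambda t: t.value_mg_kg_day)


def hazard_quotient(dose_mg_kg_day, trv):
    """HQ = estimated exposure dose / TRV; values above 1.0 flag potential risk."""
    return dose_mg_kg_day / trv.value_mg_kg_day


# Hypothetical TRVs for a single chemical of concern.
example_trvs = [
    TRV("excess liver weight", 2.0, False),
    TRV("reproductive effect", 5.0, True),
    TRV("excess salivation", 1.0, False),
]

chosen = select_trv(example_trvs)
print(chosen.endpoint, hazard_quotient(10.0, chosen))
```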

Direct evaluation of reproductive parameters using 
RSA is an alternative to population studies that require 
prolonged periods of time in order to supply reliable 
data (Krebs, 1989; Seber, 1982), which Superfund and 
other related programs cannot support. Population 
monitoring (censusing) often requires many measure- 
ments over a multi-year period, and even then, one 
cannot usually distinguish between population impacts 
and natural differences of population cycling. Spatially, 
species presence-absence data are difficult to relate to 
chemical causation in any statistically rigorous way 
(Strayer, 1999), although such information may con- 
tribute to weight of evidence arguments. By contrast, 
RSA provides a direct measure of a population's ability 
to reproduce through time, despite the fact that the 
information is collected in a single sampling event. 

4.3. Evidence of causal relationship of chemicals to 
reproductive metrics 

Although the RSA method does not absolutely 
require that any of a site's chemicals of concern be 
known reproductive toxins, there is a substantial body 
of evidence demonstrating that sperm parameter 
impairment and other related reproductive effects are 
chemically based. Direct evidence of sperm parameter 
effects in rodents exists for the metals aluminum (Llobet
et al., 1994), arsenic (Pant et al., 2001), and lead
(Wadi and Ahmad, 1999), and for the explosives 1,3-DNB
(Linder et al., 1986; Cody et al., 1981) and 1,3,5-TNB
(Reddy et al., 1996). Aside from many of the
above compounds being commonly encountered at 
contaminated sites (including army sites), all of these 
were identified in soils of the burning pads of the WBG 
study, thereby substantiating that sperm parameters, as 
surrogate measures of reproductive success, were 
appropriate for study. Additional substantiation of the
appropriateness of tracking sperm quality in rodents
derives from other laboratory studies, where chemical



exposure may produce a host of reproductive tox- 
icological endpoints. These include fewer successful 
matings, fewer litters, smaller litters, and also semi- 
niferous tubule degeneration or reduced seminiferous 
tubules. Although these studies do not evaluate sperm 
directly, it is very likely that the endpoints measured 
(and particularly those concerning seminiferous tubules) 
are sperm-mediated. The explosives 2,4,6-TNT and 
RDX, detected at the WBG burning pads, have both 
been shown to cause seminiferous tubule effects (Dilley
et al., 1982; Levine et al., 1984; Lish et al., 1984,
respectively) in addition to testicular atrophy and testi- 
cular degeneration. 

4.4. Rationale for assessing male reproduction 

Female reproductive success measures would be 
highly desirable in a field-truthing effort. However, they 
involve additional complications because nearly all 
necessitate mating studies. Wild rodents have a history 
of not breeding well under indoor conditions, and ade- 
quate numbers of animals may not be able to be pro- 
cured from the field to provide for a sufficiently large 
mating study (the case at RVAAP's WBG). Hence, it is 
common practice to pair two females with one male 
when such studies are conducted. Sufficient time would 
also have to be allowed to elapse in order for usable 
data to be gleaned from females, as their reproductive 
biology (e.g., estrous cycles) is easily offset by the stres- 
ses of having been trapped, conveyed to indoor housing, 
and handled. Evaluating any female reproductive 
measure, including those that do not require mating 
studies (e.g. estrous cycle length), assumes that animals 
will acclimatize well; necessitates additional time, cost, 
and animal housing; and shows the potential to allow 
for animal recovery with the obscuring of effects. In 
contrast, conventional sperm parameters (count, motility,
morphology), which have been shown to be affected
by chemical exposures and to correlate with reduced
reproductive success (Chapin et al., 1997), are unaffected
in trapped and handled rodents. In addition to
collection of the RSA data being time- and cost-effec- 
tive, population impacts are limited because only males 
are sacrificed. 

4.5. The utility of RSA measures as assessment 
endpoints 

A critical issue that biologists face with interpreting 
field-based data is how much of a difference between 
groups is biologically or ecologically significant. In the 
case of conventional sperm parameters, the literature 
provides thresholds for parameter impairment at which 
reduced reproductive success can be expected to occur 
in the field (Chapin et al., 1997). Numerous studies with 
mice and rats indicate that these species are robustly
fertile, and that sperm count needs to be reduced
by approximately 80% or more before reproductive success
is compromised (Meistrich et al., 1994; Bucci and Meistrich,
1987; Gray et al., 1992). In a comprehensive
multi-generational study with Swiss mice (Chapin et al., 
1997), normal fertility was maintained in treated ani- 
mals until motility declined to approximately 40-50%, 
and furthermore, all treated groups with less than 37% 
motile sperm had fewer pups than normal. In this same 
study, sperm morphology was found to be the most sen- 
sitive sperm parameter, with clearly associated reduced 
fertility occurring when the control range (2-12% 
abnormally shaped) was only slightly exceeded (16%). 
With further study in wild rodents, these various degrees 
of difference could be narrowed and bona fide field-based 
thresholds could be developed with RSA methods. 
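
The literature thresholds summarized above can be expressed as a simple screening check. This is a minimal sketch, assuming the approximate laboratory-derived values quoted in the text (about an 80% count reduction, fewer than 37% motile sperm, and about 16% abnormally shaped sperm) are applied as fixed cut-offs; the constant names, the function name `exceedances`, and the input values are hypothetical.

```python
# Approximate thresholds quoted in the text (laboratory rodents).
COUNT_REDUCTION_THRESHOLD = 80.0      # % reduction relative to reference
MOTILITY_THRESHOLD = 37.0             # % motile sperm
ABNORMAL_MORPHOLOGY_THRESHOLD = 16.0  # % abnormally shaped sperm


def exceedances(site_count, ref_count, motile_pct, abnormal_pct):
    """Flag which sperm parameters cross the literature thresholds."""
    count_reduction = 100.0 * (ref_count - site_count) / ref_count
    return {
        "count": count_reduction >= COUNT_REDUCTION_THRESHOLD,
        "motility": motile_pct < MOTILITY_THRESHOLD,
        "morphology": abnormal_pct >= ABNORMAL_MORPHOLOGY_THRESHOLD,
    }


# Hypothetical site: modest count reduction, high motility, few abnormal sperm.
print(exceedances(site_count=83.0, ref_count=100.0, motile_pct=95.0, abnormal_pct=1.0))
```

Keeping the thresholds as named constants makes it straightforward to substitute field-based values for wild rodents if such thresholds are developed.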

4.6. Interpretation of results of the RVAAP pilot study

The animal capture information can be used to sup- 
port the information provided by the primary assess- 
ment (i.e. sperm parameter) metrics. The capture 
information indicates that the key species are not being 
excluded from, and are not avoiding the WBG burning 
pads, reputedly one of the most contaminated portions 
of RVAAP. Although the captured animal numbers are 
small, the field measurements show that females are not 
any less reproductively active at the burning pads than 
at the reference sites, based on the percentages of lac- 
tating and pregnant individuals. 

Results of the sperm analysis show that male White-footed
mice are not reproductively impaired, because
there were no statistically significant differences (P > 0.05)
in sperm count or motility between groups. The CVs for all
three sperm parameters (see Table 2) of mice trapped at 
both the burning pads and the reference sites are con- 
sistent with those reported in the literature for rodents 
(Zenick et al., 1994), with count being the most variable, 
motility having only slight variability, and an abnormality
rate as low as 1%. Because rodents produce 10-20
times more sperm than needed to ensure full reproductive
success (Meistrich et al., 1994; Bucci and
Meistrich, 1987; Gray et al., 1992), it is a safe assumption
that the approximately 17% reduction in the sperm
count of WBG mice is inconsequential, and would be so
even had the difference been statistically significant.
Collectively, the trapping results and sperm parameters
for mice indicate reproductive success for these
terrestrial receptors at WBG.



5. Conclusions 

Results from the RSA method allowed us to arrive at 
a determination of acceptable risk for mammals at the 
WBG burning pads. Although desirable numbers of 



target animals were not captured, the small CVs for the 
sperm parameters allowed good statistical confidence in 
the study results. RSA indicated that reproductive 
effects, as an assessment endpoint, were not evident in 
the exposed population despite the fact that the HQ 
calculations of the initial desktop assessment had indi- 
cated otherwise. 

The finding of no unacceptable risk lends support to 
those contentions that ecological HQs are misleading 
numbers because they overestimate the prevalence of 
toxicological effects in the field (USEPA, 1997, 1998; 
Alexander, 2000; Tannenbaum, 2001; Tannenbaum et 
al., in press). The results also suggest that there is more 
to be gained by advancing to the field for a verification 
endeavor, rather than conducting mathematically 
focused second and third tier ERAs. 

Although small mammals at the contaminated sites
did not display any adverse reproductive effects, a significant
difference (P ≤ 0.05) in liver weights was evident.
When liver weights were normalized to body
weight, however, differences were not evident. Regardless,
fresh liver weight in captured small rodents is
useful as a measure of exposure, and liver measurement
should remain a fixed component of the RSA
method.

We believe that at sites where the in-place contamination
is one or more decades old, and where birds
and mammals consequently have had multigenerational
exposures, detrimental effects should be either present
or not present. In the latter case, such effects either
never occurred, or did first occur but were followed
by a period of ecological recovery. The RVAAP pilot
study was designed to detect critical detrimental (i.e.
reproductive) effects if such were present. Based on
the study's outcome, we conclude that RSA represents
a reasonable, practical and cost-attractive field-oriented
technique. The field method allows for
much better closure on animal aspects of ERA risk
than depending on HQs and other mathematical
predictions.

GUIDELINES FOR REPRODUCTIVE TOXICITY RISK ASSESSMENT
[FRL-5630-6] 



AGENCY: U.S. Environmental Protection Agency 

ACTION: Notice of availability of final Guidelines for Reproductive Toxicity Risk Assessment 

SUMMARY: The U.S. Environmental Protection Agency (EPA) is today publishing in final form a 
document entitled Guidelines for Reproductive Toxicity Risk Assessment (hereafter "Guidelines"). 
These Guidelines were developed as part of an interoffice guidelines development program by a 
Technical Panel of the Risk Assessment Forum. They were proposed initially in 1988 as separate 
guidelines for the female and male reproductive systems. Subsequently, based upon the public 
comments and Science Advisory Board (SAB) recommendations, changes made included combining
those two guidelines, integrating the hazard identification and dose-response sections, assuming as a 
default that an agent for which sufficient data were available on only one sex may also affect 
reproductive function in the other sex, expansion of the section on interpretation of female endpoints, 
and consideration of the benchmark dose approach for quantitative risk assessment. These Guidelines 
were made available again for public comment and SAB review in 1994. This notice describes the 
scientific basis for concern about exposure to agents that cause reproductive toxicity, outlines the 
general process for assessing potential risk to humans from exposure to environmental agents, and 
addresses Science Advisory Board and public comments on the 1994 Proposed Guidelines for 
Reproductive Toxicity Risk Assessment . Subsequent reviews have included the Agency's Risk 
Assessment Forum and interagency comment by members of subcommittees of the Committee on the 
Environment and Natural Resources of the Office of Science and Technology Policy. The EPA 
appreciates the efforts of all participants in the process and has tried to address their recommendations 
in these Guidelines. 

EFFECTIVE DATE: The Guidelines will be effective October 31, 1996. 




ADDRESSES: The Guidelines will be made available in the following ways: 

(1) The electronic version will be accessible on EPA's Office of Research and Development 
home page on the Internet at http://www.epa.gov/ORD/WebPubs/repro/. 

(2) 3 1/2-inch high-density computer diskettes in WordPerfect 5.1 will be available from ORD
Publications, Technology Transfer and Support Division, National Risk Management Research
Laboratory, Cincinnati, OH; telephone: 513-569-7562; fax: 513-569-7566. Please provide the EPA
No. (EPA/630/R-96/009a) when ordering.

(3) This notice contains the full document. In addition, copies of the Guidelines will be 
available for inspection at EPA headquarters in the Air and Radiation Docket and Information Center 
and in EPA headquarters and regional libraries. The Guidelines also will be made available through the 
U.S. Government Depository Library program and for purchase from the National Technical
Information Service (NTIS), Springfield, VA; telephone: 703-487-4650; fax: 703-321-8547. Please 
provide the NTIS PB No. (PB97-100093) when ordering. 

FOR FURTHER INFORMATION, CONTACT: Dr. Eric D. Clegg, Effects Identification and 
Characterization Group, National Center for Environmental Assessment-Washington Division (8623D),
U.S. Environmental Protection Agency, 401 M Street, S.W., Washington, DC 20460; telephone: 
202-564-3297; e-mail: clegg.eric@epamail.epa.gov. 

SUPPLEMENTARY INFORMATION: 
A. APPLICATION OF THE GUIDELINES 

The EPA is authorized by numerous statutes, including the Toxic Substances Control Act 
(TSCA), the Federal Insecticide, Fungicide, and Rodenticide Act (FIFRA), the Clean Air Act, the Safe 
Drinking Water Act, and the Clean Water Act, to regulate environmental agents that have the potential 
to adversely affect human health, including the reproductive system. These statutes are implemented 
through offices within the Agency. The Office of Pesticide Programs and the Office of Pollution 
Prevention and Toxics within the Agency have issued testing guidelines (U.S. EPA, 1982, 1985b, 
1996a) that provide protocols designed to determine the potential of a test substance to produce 
reproductive (including developmental) toxicity in laboratory animals. Proposed revisions to these
testing guidelines are in the final stages of completion (U.S. EPA, 1996a). The Organization for
Economic Cooperation and Development (OECD) also has issued testing guidelines (which are under 
revision) for reproduction studies (OECD, 1993b). 

These Guidelines apply within the framework of policies provided by applicable EPA statutes 
and do not alter such policies. They do not imply that one kind of data or another is prerequisite for 
action concerning any agent. The Guidelines are not intended, nor can they be relied upon, to create 
any rights enforceable by any party in litigation with the United States. This document is not a 
regulation and is not intended to substitute for EPA regulations. These Guidelines set forth current 
scientific thinking and approaches for conducting reproductive toxicity risk assessments. EPA will 
revisit these Guidelines as experience and scientific consensus evolve. 

The procedures outlined here in the Guidelines provide guidance for interpreting, analyzing, and 
using the data from studies that follow the above testing guidelines (U.S. EPA 1982, 1985b, 1996a). 
In addition, the Guidelines provide information for interpretation of other studies and endpoints (e.g., 
evaluations of epidemiologic data, measures of sperm production, reproductive endocrine system 
function, sexual behavior, female reproductive cycle normality) that have not been required routinely, 
but may be required in the future or may be encountered in reviews of data on particular agents. The 
Guidelines will promote consistency in the Agency's assessment of toxic effects on the male and female 
reproductive systems, including outcomes of pregnancy and lactation, and inform others of approaches 
that the Agency will use in assessing those risks. More specific guidance on developmental effects is 
provided by the Guidelines for Developmental Toxicity Risk Assessment (U.S. EPA, 1991). Other 
health effects guidance is provided by the Guidelines for Carcinogen Risk Assessment (U.S. EPA, 
1986a, 1996b), the Guidelines for Mutagenicity Risk Assessment (U.S. EPA, 1986c), and the 
Proposed Guidelines for Neurotoxicity Risk Assessment (U.S. EPA, 1995a). These Guidelines and 
the four cited above are complementary. 

The Agency has sponsored or participated in several conferences that addressed issues related 
to evaluations of reproductive toxicity data which provide some of the scientific bases for these risk 
assessment guidelines. Numerous publications from these and other efforts are available which provide 
background for these Guidelines (U.S. EPA, 1982, 1985b, 1995b; Galbraith et al., 1983; OECD,
1983; U.S. Congress, 1985, 1988; Kimmel, C.A. et al., 1986; Francis and Kimmel, 1988; Burger et
al., 1989; Sheehan et al., 1989; Seed et al., 1996). Also, numerous resources provide background
information on the physiology, biochemistry, and toxicology of the male and female reproductive
systems (Lamb and Foster, 1988; Working, 1989; Russell et al., 1990; Atterwill and Flack, 1992;
Scialli and Clegg, 1992; Chapin and Heindel, 1993; Heindel and Chapin, 1993; Paul, 1993; Manson
and Kang, 1994; Zenick et al., 1994; Kimmel, G.L. et al., 1995; Witorsch, 1995). A comprehensive
text on reproductive biology also has been published (Knobil et al., 1994).

B. ENVIRONMENTAL AGENTS AND REPRODUCTIVE TOXICITY 

Disorders of reproduction and hazards to reproductive health have become prominent public 
health issues. A variety of factors are associated with reproductive system disorders, including 
nutrition, environment, socioeconomic status, lifestyle, and stress. Disorders of reproduction in humans 
include but are not limited to reduced fertility, impotence, menstrual disorders, spontaneous abortion, 
low birth weight and other developmental (including heritable) defects, premature reproductive 
senescence, and various genetic diseases affecting the reproductive system and offspring. 

The prevalence of infertility, which is defined clinically as the failure to conceive after one year 
of unprotected intercourse, is difficult to estimate. National surveys have been conducted to obtain 
demographic information about infertility in the United States (Mosher and Pratt, 1990). In their 1988 
survey, an estimated 4.9 million women ages 15-44 (8.4%) had impaired fertility. The proportion of 
married couples that was infertile, from all causes, was 7.9%. 

Carlsen et al. (1992) have reported from a meta-analysis that human sperm concentration has
declined from 113 x 10^6 per mL of semen prior to 1960 to 66 x 10^6 per mL subsequently. When
combined with a reported decline in semen volume from 3.4 mL to 2.75 mL, that suggests a decline in
total number of sperm of approximately 50%. Increased incidence of human male hypospadias,
cryptorchidism, and testicular cancer has also been reported over the last 50 years (Giwercman et al.,
1993). Several other retrospective studies that examined semen characteristics from semen donors
have obtained conflicting results (Auger et al., 1995; Bujan et al., 1996; Fisch et al., 1996; Ginsburg et
al., 1994; Irvine et al., 1996; Paulsen et al., 1996; Van Waeleghem et al., 1996; Vierula et al., 1996).
While concerns exist about the validity of some of those conclusions, the data indicating an increase in 
human testicular cancer, as well as possible occurrence of other plausibly related effects such as
reduced sperm production, hypospadias, and cryptorchidism, suggest that an adverse effect may have
occurred. However, there is no definitive evidence that such adverse human health effects have been 
caused by environmental chemicals. 
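
The "approximately 50%" decline in total sperm number follows from multiplying the reported concentrations by the reported semen volumes. The arithmetic can be checked with the short sketch below; the variable names are illustrative only.

```python
# Reported values: concentration (sperm per mL) and semen volume (mL).
conc_before, vol_before = 113e6, 3.4   # prior to 1960
conc_after, vol_after = 66e6, 2.75     # subsequently

total_before = conc_before * vol_before   # total sperm per ejaculate, earlier period
total_after = conc_after * vol_after      # total sperm per ejaculate, later period

decline_pct = 100.0 * (total_before - total_after) / total_before
print(f"Decline in total sperm number: {decline_pct:.1f}%")
```

The result is a little over 50%, consistent with the rounded figure quoted in the text.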

Endometriosis is a painful reproductive and immunologic disease in women that is characterized 
by aberrant location of uterine endometrial cells, often leading to infertility. It affects approximately five 
million women in the United States between 15 and 45 years of age. Very limited research has 
suggested a link between dioxin exposure and development of endometriosis in rhesus monkeys (Rier
et al., 1993). Gerhard and Runnebaum (1992) reported an association in women between occurrence
of endometriosis and elevated blood PCB levels, while a subsequent small clinical study found no 
significant correlations between disease severity in women and serum levels of halogenated aromatic 
hydrocarbons (Boyd et al., 1995).

Even though not all infertile couples seek treatment, and infertility is not the only adverse 
reproductive effect, it is estimated that in 1986, Americans spent about $1 billion on medical care to 
treat infertility alone (U.S. Congress, 1988). With the increased use of assisted reproduction 
techniques in the last 10 years, that amount has increased substantially. 

Disorders of the male or female reproductive system may also be manifested as adverse 
outcomes of pregnancy. For example, it has been estimated that approximately 50% of human 
conceptuses fail to reach term (Hertig, 1967; Kline et al., 1989). Methods that detect pregnancy as 
early as eight days after conception have shown that 32%-34% of postimplantation pregnancies end in 
embryonic or fetal loss (Wilcox et al., 1988; Zinaman et al., 1996). Approximately 3% of newborn
children have one or more significant congenital malformations at birth, and by the end of the first 
postnatal year, about 3% more are recognized to have serious developmental defects (Shepard, 1986). 
Of these, it is estimated that 20% are of known genetic transmission, 10% are attributable to known 
environmental factors, and the remaining 70% result from unknown causes (Wilson, 1977). Also, 
approximately 7.4% of children have low birth weight (i.e., below 2.5 kg) (Selevan, 1981). 

A variety of developmental alterations may be detected after either pre- or postnatal exposure. 
Several of these are discussed in the Guidelines for Developmental Toxicity Risk Assessment (U.S. 
EPA, 1991), and developmental neurotoxicity is discussed in the Proposed Guidelines for 
Neurotoxicity Risk Assessment (U.S. EPA, 1996a). Relative to developmental reproductive
alterations, chemical or physical agents can affect the female and male reproductive systems at any time
in the life cycle, including susceptible periods in development. The reproductive system begins to form 
early in gestation, but structural and functional maturation is not completed until puberty. Exposure to 
toxicants early in development can lead to alterations that may affect reproductive function or 
performance well after the time of initial exposure. Examples include the actions of estrogens,
anti-androgens or dioxin in interfering with male sexual differentiation (Gill et al., 1979; Gray et al., 1994,
1995; Giusti et al., 1995; Gray and Ostby, 1995). Adverse effects such as reduced fertility in offspring
may appear as delayed consequences of in utero exposure to toxicants. Effects of toxic agents on other 
parameters such as sexual behavior, reproductive cycle normality, or gonadal function can also alter 
fertility (Chapman, 1983; Dixon and Hall, 1984; Schrag and Dixon, 1985b; U.S. Congress, 1985). 
For example, developmental exposure to environmental compounds that possess steroidogenic 
(Mattison, 1985) or antisteroidogenic (Schardein, 1993) activity affects the onset of puberty and
reproductive function in adulthood. 

Numerous agents have been shown to cause reproductive toxicity in adult male and female 
laboratory animals and in humans (Mattison, 1985; Schrag and Dixon, 1985a,b; Waller et al., 1985; 
Lewis, 1991). In adult males and females, exposure to agents of abuse, e.g., cocaine, disrupts normal 
reproductive function in both test species and humans (Smith, C.G. and Gilbeau, 1985). Numerous 
chemicals disrupt the ovarian cycle, alter ovulation, and impair fertility in experimental animals and
humans. These include agents with steroidogenic activity, certain pesticides, and some metals (Thomas, 
1981; Mattison, 1985). In males, estrogenic compounds can be testicular toxicants in rodents and 
humans (Colborn et al., 1993; Toppari et al., 1995). Dibromochloropropane (DBCP) impairs
spermatogenesis in both experimental animals and humans by another mechanism. These and other 
examples of toxicant-induced effects on reproductive function have been reviewed (Katz and
Overstreet, 1981; Working, 1988). 

Altered reproductive health is often manifested as an adverse effect on the reproductive success
or sexual behavior of the couple even though only one of the pair may be affected directly. Often, it is 
difficult to discern which partner has reduced reproductive capability. For example, exposure of the 
male to an agent that reduces the number of normal sperm may result in reduced fertility in the couple, 
but without further diagnostic testing, the affected partner may not be identified. Also, adverse effects
on the reproductive systems of the two sexes may not be detected until a couple attempts to conceive a
child. 

For successful reproduction, it is critical that the biologic integrity of the human reproductive 
system be maintained. For example, the events in the estrous or menstrual cycle are closely 
interrelated; changes in one event in the cycle can alter other events. Thus, a short or inadequate luteal 
phase of the menstrual cycle is associated with disorders in ovarian follicular steroidogenesis, 
gonadotropin secretion, and endometrial integrity (McNatty, 1979; Scommegna et al., 1980; Smith,
S.K. et al., 1984; Sakai and Hodgen, 1987). Toxicants may interfere with luteal function by altering
hypothalamic or pituitary function and by affecting ovarian response (La Bella et al., 1973a,b).

Fertility of the human male is particularly susceptible to agents that reduce the number or quality 
of sperm produced. Compared with many other species, human males produce fewer sperm relative to 
the number of sperm required for fertility (Amann, 1981; Working, 1988). As a result, many men are 
subfertile or infertile (Amann, 1981). The incidence of infertility in men is considered to increase at
sperm concentrations below 20 x 10^6 sperm per mL of ejaculate. As the concentration of sperm drops
below that level, the probability of a pregnancy resulting from a single ejaculation declines. If the
number of normal sperm per ejaculate is sufficiently low, fertilization is unlikely and an infertile condition
exists. However, some men with low sperm concentrations are able to achieve conception and many
subfertile men have concentrations greater than 20 x 10^6, illustrating the importance of sperm quality.
Toxic agents may further decrease production of sperm and increase risk of impaired fertility. 

C. THE RISK ASSESSMENT PROCESS AND ITS APPLICATION TO 
REPRODUCTIVE TOXICITY 

Risk assessment is the process by which scientific judgments are made concerning the potential 
for toxicity to occur in humans. In 1983, the National Research Council (NRC) defined risk 
assessment as comprising some or all of the following components: hazard identification, dose- 
response assessment, exposure assessment, and risk characterization (NRC, 1983). In its 1994 report, 
Science and Judgment in Risk Assessment, the NRC extended its view of the paradigm to include
characterization of each component (NRC, 1994). In addition, it noted the importance of an interactive
approach that deals with recurring conceptual issues that cut across all stages of risk assessment.
These Guidelines adopt an interactive approach by organizing the process around the components of
hazard characterization, the quantitative dose-response analysis, the exposure assessment, and the risk 
characterization where hazard characterization combines hazard identification with qualitative 
consideration of dose-response relationships, route, timing, and duration of exposure. This is done 
because, in practice, hazard identification for reproductive toxicity and other noncancer health effects 
includes an evaluation of dose-response relationships, route, timing, and duration of exposure in the 
studies used to identify the hazard. Determining a hazard often depends on whether a dose-response 
relationship is present (Kimmel, C.A. et al., 1990). This approach combines the information important
in comparing the toxicity of a chemical to potential human exposure scenarios identified as part of the 
exposure assessment. Also, it minimizes the potential for labeling chemicals inappropriately as 
"reproductive toxicants" on a purely qualitative basis. 

In hazard characterization, all available experimental animal and human data, including 
observed effects, associated doses, routes, timing, and duration of exposure, are examined to determine 
if an agent causes reproductive toxicity in that species and, if so, under what conditions. From the 
hazard characterization and criteria provided in these Guidelines, the health-related database can be 
characterized as sufficient or insufficient for use in risk assessment (Section 3.7). This approach does 
not preclude the evaluation and use of the data for other purposes when adequate quantitative 
information for setting reference doses (RfDs) and reference concentrations (RfCs) is not available. 

The next step, the quantitative dose-response analysis (Section 4), includes determining the
no-observed-adverse-effect-level (NOAEL) and/or the lowest-observed-adverse-effect-level
(LOAEL) for each study and type of effect. Because of the limitations associated with the use of the
NOAEL, the Agency is beginning to use an additional approach, the benchmark dose approach
(Crump, 1984; U.S. EPA, 1995b), for a more quantitative dose-response evaluation when allowed by
the data. The benchmark dose approach takes into account the variability in the data and the slope of 
the dose-response curve, and thus, provides more complete use of the data for calculation of the RfD 
or RfC. If the data are considered sufficient for risk assessment, and if reproductive toxicity occurs at 
the lowest toxic dose level (i.e., the critical effect), an RfD or RfC, based on adverse reproductive 
effects, could be derived. This RfD or RfC is derived using the NOAEL or benchmark dose divided 
by uncertainty factors to account for interspecies differences in response, intraspecies variability, and
deficiencies in the database. 
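
The derivation described above can be sketched in a few lines of Python. This is a minimal illustration only, not the Agency's procedure: the function names, the example dose groups, and the uncertainty factor values are all hypothetical, and the benchmark dose alternative is not modeled.

```python
from math import prod


def noael_and_loael(dose_groups):
    """Return (NOAEL, LOAEL) from (dose, adverse_effect_observed) pairs.

    LOAEL: lowest tested dose at which an adverse effect was observed.
    NOAEL: highest tested nonzero dose below the LOAEL with no adverse effect.
    Either value is None when the study does not identify it.
    """
    ordered = sorted(dose_groups)
    adverse = [dose for dose, effect in ordered if effect]
    loael = adverse[0] if adverse else None
    clean = [
        dose
        for dose, effect in ordered
        if not effect and dose > 0 and (loael is None or dose < loael)
    ]
    noael = clean[-1] if clean else None
    return noael, loael


def reference_dose(point_of_departure, uncertainty_factors):
    """Divide the NOAEL (or benchmark dose) by the product of the factors."""
    return point_of_departure / prod(uncertainty_factors.values())


# Hypothetical study: doses in mg/kg-day, control group at 0.
study = [(0.0, False), (5.0, False), (25.0, True), (100.0, True)]
noael, loael = noael_and_loael(study)

# Illustrative factor values only; actual values are chosen case by case.
factors = {"interspecies": 10, "intraspecies": 10, "database": 1}
rfd = reference_dose(noael, factors)
```

With these assumed inputs the NOAEL is 5 mg/kg-day, the LOAEL is 25 mg/kg-day, and the point of departure is divided by a composite factor of 100.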

Exposure assessment identifies and describes populations exposed or potentially exposed to 
an agent, and presents the type, magnitude, frequency, and duration of such exposures. Those 
procedures are considered separately in the Guidelines for Exposure Assessment (U.S. EPA, 1992). 
However, unique considerations for reproductive toxicity exposure assessments are detailed in Section 
5. 

A statement of the potential for human risk and the consequences of exposure can come only 
from integrating the hazard characterization and quantitative dose-response analysis with human 
exposure estimates in the risk characterization. As part of risk characterization, the strengths and 
weaknesses in each component of the risk assessment are summarized along with major assumptions, 
scientific judgments, and to the extent possible, qualitative descriptions and quantitative estimates of the 
uncertainties. 

In 1992, EPA issued a policy memorandum (Habicht, 1992) and guidance package on risk 
characterization to encourage more comprehensive risk characterizations, to promote greater 
consistency and comparability among risk characterizations, and to clarify the role of professional 
judgment in characterizing risk. In 1995, the Agency issued a new risk characterization policy and 
guidance (Browner, 1995) that refines and reaffirms the principles found in the 1992 policy and outlines 
a process within the Agency for implementation. Although specific program policies and procedures 
are still evolving, these Guidelines discuss attributes of the Agency's risk characterization policy as it 
applies to reproductive toxicity. 

Risk assessment is just one component of the regulatory process. The other component, risk 
management, uses risk characterization along with directives of the enabling regulatory legislation and 
other factors to decide whether to control exposure to the suspected agent and the level of control. 
Risk management decisions also consider socioeconomic, technical, and political factors. Risk 
management is not discussed directly in these guidelines because the basis for decisionmaking goes 
beyond scientific considerations alone. However, the use of scientific information in this process is 
discussed. For example, the acceptability of the margin of exposure (MOE) is a risk management
decision, but the scientific bases for generating this value are discussed here. 
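
As a rough illustration of the MOE mentioned above, the sketch below assumes the common definition of the MOE as the ratio of the NOAEL to the estimated human exposure, both expressed in the same units. That definition is not spelled out in this passage, and the function name and numbers are hypothetical.

```python
def margin_of_exposure(noael, estimated_exposure):
    """Ratio of the NOAEL to the estimated human exposure (same units)."""
    if estimated_exposure <= 0:
        raise ValueError("estimated exposure must be positive")
    return noael / estimated_exposure


# Hypothetical values in mg/kg-day.
moe = margin_of_exposure(noael=10.0, estimated_exposure=0.5)
```

Whether a given MOE is acceptable remains a risk management judgment; the calculation only supplies the number.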



Dated: October 15, 1996                Signed by EPA Administrator

Carol M. Browner 

PART A: GUIDELINES FOR REPRODUCTIVE TOXICITY RISK ASSESSMENT



1. OVERVIEW 

These Guidelines describe the procedures that the EPA follows in using existing data to 
evaluate the potential toxicity of environmental agents to the human male and female reproductive 
systems and to developing offspring. These Guidelines focus on reproductive system function as it 
relates to sexual behavior, fertility, pregnancy outcomes, and lactating ability, and the processes that can 
affect those functions directly. Included are effects on gametogenesis and gamete maturation and 
function, the reproductive organs, and the components of the endocrine system that directly support 
those functions. These Guidelines concentrate on the integrity of the male and female reproductive 
systems as required to ensure successful procreation. They also emphasize the importance of 
maintaining the integrity of the reproductive system for overall physical and psychologic health. The 
Guidelines for Developmental Toxicity Risk Assessment (U.S. EPA, 1991) focus specifically on 
effects of agents on development and should be used as a companion to these Guidelines. 

In evaluating reproductive effects, it is important to consider the presence, and where possible, 
the contribution of other manifestations of toxicity such as mutagenicity or carcinogenicity as well as 
other forms of general systemic toxicity. The reproductive process is such that these areas overlap, and 
all should be considered in reproductive risk assessments. Although the endpoints discussed in these 
Guidelines can detect impairment to components of the reproductive process, they may not discriminate 
effectively between nonmutagenic (e.g., cytotoxic) and mutagenic mechanisms. Examples of endpoints 
affected by either type of mechanism are sperm head morphology and preimplantation loss. If the 
effects seen may result from mutagenic events, then there is the potential for transmissible genetic 
damage. In such cases, the Guidelines for Mutagenicity Risk Assessment (U.S. EPA, 1986c) should 
be consulted in conjunction with these Guidelines. The Guidelines for Carcinogen Risk Assessment 
(U.S. EPA, 1986a, 1996b) should be consulted if reproductive system or developmentally induced 
cancer is detected. 

For assessment of risk to the human reproductive systems, the most appropriate data are those 
derived from human studies having adequate study design and power. In the absence of adequate 
human data, our understanding of the mechanisms controlling reproduction supports the use of data 
from experimental animal studies to estimate the risk of reproductive effects in humans. However, 
some information needed for extrapolation of data from experimental animal studies to humans is not 
generally available. Therefore, to bridge these gaps in information, a number of default assumptions are 
made. These default assumptions, which are summarized in Table 1, should not preclude inquiry into
the relevance of the data to potential human risk and should be invoked only after examination of the 
available information indicates that necessity. These assumptions provide the inferential basis for the 
approaches to risk assessment in these Guidelines. Each assumption should be evaluated along with 
other relevant information in making a final judgment as to human risk for each agent, and that 
information summarized in the risk characterization. 

An agent that produces an adverse reproductive effect in experimental animal studies is 
assumed to pose a potential reproductive threat to humans. This assumption is based on comparisons 
of data for agents that are known to cause human reproductive toxicity (Thomas, 1981; Nisbet and 
Karch, 1983; Kimmel, C.A. et al., 1984, 1990; Hemminki and Vineis, 1985; Meistrich, 1986; 
Working, 1988). In general, the experimental animal data indicated adverse reproductive effects that 
are also seen in humans. 

Because similar mechanisms can be identified in the male and female of many mammalian 
species, effects of xenobiotics on male and female reproductive processes are assumed generally to be 
similar across species unless demonstrated otherwise. However, for developmental outcomes, it is 
assumed that the specific outcomes seen in experimental animal studies are not necessarily the same as 
those produced in humans. This latter assumption is made because of the possibility of species-specific 
differences in timing of exposure relative to critical periods of development, pharmacokinetics (including 
metabolism), developmental patterns, placentation, or modes of action. However, adverse 
developmental outcomes in laboratory mammalian studies are presumed to predict a hazard for adverse 
developmental outcome in humans. 

When sufficient data are available (e.g., pharmacokinetic) to allow a decision, the most 
appropriate species should be used to estimate human risk. In the absence of such data, it is assumed 
that the most sensitive species is most appropriate because, for the majority of agents known to cause 
human reproductive toxicity, humans appear to be as or more sensitive than the most sensitive animal 
species tested (Nisbet and Karch, 1983; Kimmel, C.A. et al., 1984, 1990; Hemminki and Vineis, 
1985; Meistrich, 1986; Working, 1988), based on data from studies that determined dose on a body 
weight or air concentration basis. 
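
The default of relying on the most sensitive species can be illustrated with a short sketch. The species names and NOAEL values are hypothetical, and treating the lowest NOAEL as the marker of sensitivity is a simplifying assumption made for this example.

```python
def most_sensitive_species(noael_by_species):
    """Return the species with the lowest NOAEL (same dose units throughout)."""
    if not noael_by_species:
        raise ValueError("no species data supplied")
    return min(noael_by_species, key=noael_by_species.get)


# Hypothetical NOAELs in mg/kg-day, dose expressed on a body weight basis.
noaels = {"rat": 15.0, "mouse": 40.0, "rabbit": 8.0}
default_species = most_sensitive_species(noaels)
```

When pharmacokinetic or other data identify a more appropriate species, that species would be used instead of this default.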

In the absence of specific information to the contrary, it is assumed that a chemical that affects
reproductive function in one sex may also adversely affect reproductive function in the other sex. This
assumption for reproductive risk assessment is based on three considerations: (1) For most agents, the
nature of the testing and the data available are limited, reducing confidence that the potential for toxicity
to both sexes and their offspring has been examined equally; (2) Exposures of either males or females
have resulted in developmental toxicity; and (3) Many of the mechanisms controlling important aspects
of reproductive system function are similar in females and males, and therefore could be susceptible to
the same agents. Information that would negate this assumption would demonstrate that either a
mechanistic difference existed between the sexes that would preclude toxic action on the other sex or,
on the basis of sufficient testing, an agent did not produce an adverse reproductive effect when
administered to the other sex. Mechanistic differences could include functions that do not exist in the
other sex (e.g., lactation), differences in endocrine control of affected organ development or function,
or pharmacokinetic and metabolic differences between sexes.

Table 1. Default assumptions in reproductive toxicity risk assessment

1. An agent that produces an adverse reproductive effect in experimental animals is assumed to
pose a potential threat to humans.

2. Effects of xenobiotics on male and female reproductive processes are assumed generally
to be similar unless demonstrated otherwise. For developmental outcomes, the specific
effects in humans are not necessarily the same as those seen in the experimental species.

3. In the absence of information to determine the most appropriate experimental species, data
from the most sensitive species should be used.

4. In the absence of information to the contrary, an agent that affects reproductive function
in one sex is assumed to adversely affect reproductive function in the other sex.

5. A nonlinear dose-response curve is assumed for reproductive toxicity.

In a quantitative dose-response analysis, mode of action, pharmacokinetic, and 
pharmacodynamic information should be used to predict the shape of the dose-response curve when 
sufficient information of that nature is available. When that information is insufficient, 
it has generally been assumed that there is a nonlinear dose-response for reproductive toxicity. This is 
based on known homeostatic, compensatory, or adaptive mechanisms that must be overcome before a 
toxic endpoint is manifested and on the rationale that cells and organs of the reproductive system and 
the developing organism are known to have some capacity for repair of damage. However, in a 
population, background levels of toxic agents and preexisting conditions may increase the sensitivity of 
some individuals in the population. Thus, exposure to a toxic agent may result in an increased risk of 
adverse effects for some, but not necessarily all, individuals within the population. Although a threshold 
may exist for endpoints of reproductive toxicity, it usually is not feasible to distinguish empirically 
between a true threshold and a nonlinear low-dose relationship. The shift to the term nonlinear does 
not change the RfD/RfC methodology for reproductive system health effects, including the use of 
uncertainty factors. 

2. DEFINITIONS AND TERMINOLOGY



For the purposes of these Guidelines, the following definitions will be used: 

Reproductive toxicity - The occurrence of biologically adverse effects on the reproductive systems of 
females or males that may result from exposure to environmental agents. The toxicity may be 
expressed as alterations to the female or male reproductive organs, the related endocrine system, or 
pregnancy outcomes. The manifestation of such toxicity may include, but not be limited to, adverse 
effects on onset of puberty, gamete production and transport, reproductive cycle normality, sexual 
behavior, fertility, gestation, parturition, lactation, developmental toxicity, premature reproductive 
senescence, or modifications in other functions that are dependent on the integrity of the reproductive 
systems. 

Fertility - The capacity to conceive or induce conception. 

Fecundity - The ability to produce offspring within a given period of time. For litter-bearing species, 
the ability to produce large litters is also a component of fecundity. 

Fertile - A level of fertility that is within or exceeds the normal range for that species. 

Infertile - Lacking fertility for a specified period. The infertile condition may be temporary; permanent 
infertility is termed sterility. 

Subfertile - A level of fertility that is below the normal range for that species but not infertile. 

Developmental toxicity - The occurrence of adverse effects on the developing organism that may 
result from exposure prior to conception (either parent), during prenatal development, or postnatally to 
the time of sexual maturation. Adverse developmental effects may be detected at any point in the 
lifespan of the organism. The major manifestations of developmental toxicity include (1) death of the 
developing organism, (2) structural abnormality, (3) altered growth, and (4) functional deficiency (U.S. 
EPA, 1991). 
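
The relationship among the terms fertile, subfertile, and infertile defined above can be expressed as a simple classification. The sketch assumes a hypothetical numeric fertility measure and a species-specific lower bound of the normal range; neither is specified in these Guidelines.

```python
def classify_fertility(fertility_measure, normal_range_low):
    """Classify a fertility level relative to the species' normal range.

    fertility_measure: hypothetical nonnegative index (0 means no fertility).
    normal_range_low: lower bound of the normal range for the species.
    """
    if fertility_measure <= 0:
        return "infertile"   # lacking fertility for the period examined
    if fertility_measure < normal_range_low:
        return "subfertile"  # below the normal range but not infertile
    return "fertile"         # within or above the normal range


labels = [classify_fertility(x, normal_range_low=0.8) for x in (0.0, 0.5, 0.9)]
```

Sterility, as permanent infertility, would require information on duration that a single measurement does not provide.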

3. HAZARD CHARACTERIZATION FOR REPRODUCTIVE TOXICANTS



Identification and characterization of reproductive hazards can be based on data from either 
human or experimental animal studies. Such data can result from routine or accidental environmental or 
occupational exposures or, for experimental animals, controlled experimental exposures. A hazard 
characterization should evaluate all of the information available and should: 

Identify the strengths and limitations of the database, including all available epidemiologic
and experimental animal studies as well as pharmacokinetic and mechanistic information.

Identify and describe key toxicological studies. 

Describe the type(s) of effects. 

Describe the nature of the effects (irreversible, reversible, transient, progressive, delayed, 
residual, or latent effects). 

Describe how much is known about how (through what biological mechanism) the agent
produces adverse effects.

Discuss the other health endpoints of concern. 

Discuss any nonpositive data in humans or experimental animals. 

Discuss the dose-response data (epidemiologic or experimental animal) available for further 
dose-response analysis. 

Discuss the route, level, timing, and duration of exposure in studies as compared to
expected human exposures.

Summarize the hazard characterization, including: 

- Major assumptions used, 

- Confidence in the conclusions, 

- Alternative conclusions also supported by the data, 

- Major uncertainties identified, and 

- Significant data gaps. 

Conduct of a hazard characterization requires knowledge of the protocols in which data were 
produced and the endpoints that were evaluated. Sections 3.1 and 3.2 present the traditional testing 
protocols for rodents and endpoints used to evaluate male and female reproductive toxicity along with 
evaluation of their strengths and limitations. Because many endpoints are common to multiple 
protocols, endpoints are considered separately from the discussion of the overall protocol structures. 
These are followed by presentation of many of the specific characteristics of human studies (Section 
3.3) and limited discussions of pharmacokinetic and structure-activity factors (Sections 3.4 and 3.5).

3.1. LABORATORY TESTING PROTOCOLS

3.1.1. Introduction 

Testing protocols describe the procedures to be used to provide data for risk assessments. The 
quality and usefulness of those data are dependent on the design and conduct of the tests, including 
endpoint selection and resolving power. A single protocol is unlikely to provide all of the information 
that would be optimal for conducting a comprehensive risk assessment. For example, the test design to 
study reversibility of adverse effects or mechanism of toxic action may be different from that needed to 
determine time of onset of an effect or for calculation of a safe level for repeated exposure over a long 
term. Ideally, results from several different types of tests should be available when performing a risk 
assessment. Typically, only limited data are available. Under those conditions, the limited data should 
be used to the extent possible to assess risk. 

Integral parts of the hazard characterization and quantitative dose-response processes are the 
evaluation of the protocols from which data are available and the quality of the resulting data. In this 
section, design factors that are of particular importance in reproductive toxicity testing are discussed. 
Then, standardized protocols that may provide useful data for reproductive risk assessments are 
described. 

3.1.2. Duration of Dosing 

To evaluate adequately the potential effects of an agent on the reproductive systems, a 
prolonged treatment period is needed. For example, damage to spermatogonial stem cells will not 
appear in samples from the cauda epididymis or in ejaculates for 8 to 14 weeks, depending on the test 
species. With some chemical agents that bioaccumulate, the full impact on a given cell type could be 
further delayed, as could the impact on functional endpoints such as fertility. In such situations, 
adequacy of the dosing duration is a critical factor in the risk assessment. 

Conversely, adaptation may occur that allows tolerance to levels of a chemical that initially 
caused an effect that could be considered adverse. An example is interference with ovulation by
chlordimeform (Goldman et al., 1991), an effect for which a compensatory mechanism is available.
Thus, with continued dosing, the compensatory mechanism can be activated so that the initial adverse 
effect is masked. 

In these situations, knowledge of the relevant pharmacokinetic and pharmacodynamic data can 
facilitate selection of dose levels and treatment duration (see also section on Exposure Assessment). 
Equally important is proper timing of examination of treated animals relative to initiation and termination 
of exposure to the agent. 

3.1.3. Length of Mating Period

Traditionally, pairs of rats or mice are allowed to cohabit for periods ranging from several days 
to 3 weeks. Given a 4- or 5-day estrous cycle, each female that is cycling normally should be in estrus 
four or five times during a 21-day mating period. Therefore, information on the interval or the number
of cycles needed to achieve pregnancy may provide evidence of reduced fertility that is not available 
from fertility data. Additionally, during each period of behavioral estrus, the male has the opportunity to 
copulate a number of times, resulting in delivery of many more sperm than are required for fertilization. 
When an unlimited number of matings is allowed in fertility testing, a large impact to sperm production is 
necessary before an adverse effect on fertility can be detected. 
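
The count of estrus opportunities cited above follows from simple arithmetic, sketched here. The function name is hypothetical, and the calculation assumes regular cycles of fixed length.

```python
def estrus_opportunities(mating_days, cycle_days):
    """Number of complete estrous cycles that fit within the mating period."""
    if cycle_days <= 0:
        raise ValueError("cycle length must be positive")
    return mating_days // cycle_days


# A 21-day cohabitation with 4- and 5-day cycles.
with_4_day_cycle = estrus_opportunities(21, 4)
with_5_day_cycle = estrus_opportunities(21, 5)
```

A 4-day cycle allows five opportunities and a 5-day cycle allows four, which is why the number of cycles needed to achieve pregnancy can reveal reduced fertility that a simple pregnant/not pregnant tally would hide.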

3.1.4. Number of Females Mated to Each Male 

The EPA test guidelines prepared pursuant to FIFRA and TSCA specify the use of 20 males 
and enough females to produce at least 20 pregnancies for each dose group in each generation in the 
multigeneration reproduction test (U.S. EPA, 1982, 1985b, 1996a). However, in some tests that were 
not designed to conform to EPA test guidelines (OECD, 1983), 20 pregnancies may have been
achieved by mating two females with each male and using fewer than 20 males per treatment group. In 
such cases, the statistical treatment of the data should be examined carefully. With multiple females 
mated to each male, the degree of independence of the observations for each female may not be 
known. In that situation, when the cause of the adverse effect cannot be assigned with confidence to 
only one sex, dependence should be assumed and the male used as the experimental unit in statistical 
analyses. Using fewer males as the experimental unit reduces ability to detect an effect. 
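
Treating the male as the experimental unit amounts to collapsing the outcomes of all females mated to a given male into one observation per male. The sketch below shows that aggregation with hypothetical identifiers and outcomes; it does not prescribe a particular statistical test.

```python
from collections import defaultdict


def per_male_pregnancy_rates(matings):
    """Collapse (male_id, female_pregnant) records to one rate per male."""
    outcomes = defaultdict(list)
    for male_id, pregnant in matings:
        outcomes[male_id].append(pregnant)
    return {male: sum(results) / len(results) for male, results in outcomes.items()}


# Hypothetical dose group: three males, each mated with two females.
matings = [
    ("M1", True), ("M1", True),
    ("M2", True), ("M2", False),
    ("M3", False), ("M3", False),
]
rates = per_male_pregnancy_rates(matings)
n_units = len(rates)  # sample size for analysis is the number of males
```

Here six mated females yield only three independent observations, which shows how the smaller number of units reduces the ability to detect an effect.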

3.1.5. Single- and Multigeneration Reproduction Tests 

Reproductive toxicity studies in laboratory animals generally involve continuous exposure to a 
test substance for one or more generations. The objective is to detect effects on the integrated 
reproductive process as well as to study effects on the individual reproductive organs. Test guidelines 
for the conduct of single- and multigeneration reproduction protocols have been published by the 
Agency pursuant to FIFRA and TSCA and by OECD (U.S. EPA, 1982, 1985b, 1996a; Galbraith et
al., 1983; OECD, 1983).

The single-generation reproduction test evaluates effects of subchronic exposure of peripubertal 
and adult animals. In the multigeneration reproduction protocol, F1 and F2 offspring are exposed
continuously in utero from conception until birth and during the preweaning period. This allows
detection of effects that occur from exposures throughout development, including the peripubertal and
young adult phases. Because the parental and subsequent filial generations have different exposure
histories, reproductive effects seen in any particular generation are not necessarily comparable with 
those of another generation. Also, successive litters from the same parents cannot be considered as 
replicates because of factors such as continuing exposure of the parents, increased parental age, sexual 
experience, and parity of the females. 

In a single- or multigeneration reproduction test, rats are used most often. In a typical 
reproduction test, dosing is initiated at 5 to 8 weeks of age and continued for 8 to 10 weeks prior to 
mating to allow effects on gametogenesis to be expressed and increase the likelihood of detecting 
histologic lesions. Three dose levels plus one or more control groups are usually included. Enough 
males and females are mated to ensure 20 pregnancies per dose group for each generation. Animals 
producing the first generation of offspring should be considered the parental (P) generation, and all 
subsequent generations should be designated filial generations (e.g., F1, F2). Only the P generation is
mated in a single-generation test, while both the P and F1 generations are mated in a two-generation
reproduction test. 
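
The premating schedule for the P generation described above implies a range of ages at first mating, which the following sketch computes. The function name is hypothetical and the calculation simply adds the stated ranges.

```python
def age_at_mating_weeks(start_age_range, premating_dosing_range):
    """Earliest and latest age (weeks) at mating for the P generation."""
    start_low, start_high = start_age_range
    dose_low, dose_high = premating_dosing_range
    return start_low + dose_low, start_high + dose_high


# Dosing begins at 5 to 8 weeks of age and runs 8 to 10 weeks before mating.
earliest, latest = age_at_mating_weeks((5, 8), (8, 10))
```

Under these figures the P animals are mated at roughly 13 to 18 weeks of age.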

In the P generation, both females and males are treated prior to and during mating, with 
treatment usually beginning around puberty. Cohabitation can be allowed for up to 3 weeks (U.S. 
EPA, 1982, 1985b), during which the females are monitored for evidence of mating. Females continue 
to be exposed during gestation and lactation. 

In the two-generation reproduction test, randomly selected F1 male and female offspring
continue to be exposed after weaning (day 21) and through the mating period. Treatment of mated F1
females is continued throughout gestation and lactation. More than one litter may be produced from
either P or F1 animals. Depending on the route of exposure of lactating females, it is important to
consider that offspring may be exposed to a chemical by ingestion of maternal feed or water (diet or 
drinking water studies), by licking of exposed fur (inhalation study), by contact with treated skin 
(dermal study), or by coprophagia, as well as via the milk. 

In single- and multigeneration reproduction tests, reproductive endpoints evaluated in P and F 
generations usually include visual examination of the reproductive organs. Weights and histopathology 
of the testes, epididymides, and accessory sex glands may be available from males, and histopathology 
of the vagina, uterus, cervix, ovaries, and mammary glands from females. Uterine and ovarian weights 
also are often available. Male and female mating and fertility indices (Section 3.2.2.1) are usually 
presented. In addition, litters (and often individual pups) are weighed at birth and examined for number 
of live and dead offspring, gender, gross abnormalities, and growth and survival to weaning. 
Maturation and behavioral testing may also be performed on the pups. 



If effects on fertility or pregnancy outcome are the only adverse effects observed in a study 
using one of these protocols, the contributions of male- and female-specific effects often cannot be 
distinguished. If testicular histopathology or sperm evaluations have been included, it may be possible 
to characterize a male-specific effect. Similarly, ovarian and reproductive tract histology or changes in 
estrous cycle normality may be indicative of female-specific effects. However, identification of effects 
in one sex does not exclude the possibility that both sexes may have been affected adversely. Data 
from matings of treated males with untreated females and vice versa (crossover matings) are necessary 
to separate sex-specific effects. 

An EPA workshop has considered the relative merits of one- versus two-generation 
reproductive effects studies (Francis and Kimmel, 1988). The participants concluded that a one- 
generation study is insufficient to identify all potential reproductive toxicants, because it would exclude 
detection of effects caused by prenatal and postnatal exposures (including the prepubertal period) as 
well as effects on germ cells that could be transmitted to and expressed in the next generation. For 
example, adverse transgenerational effects on reproductive system development by agents that disrupt 
endocrine control of sexual differentiation would be missed. A one-generation test might also miss 
adverse effects with delayed or latent onset because of the shorter duration of exposure for the P 
generation. These limitations are shared with the shorter-term "screening" protocols described below. 
Because of these limitations, a comprehensive reproductive risk assessment should include results from 
a two-generation test or its equivalent. A further recommendation from the workshop was to include 
sperm analyses and estrous cycle normality as endpoints in reproductive effects studies. These 
endpoints have been included in the proposed revisions to the EPA test guideline (U.S. EPA, 1996a). 

In studies where parental and offspring generations are evaluated, there are additional risk 
assessment issues regarding the relationships of reproductive outcomes across generations. Increasing 
vulnerability of subsequent generations is often, but not always, observed. Qualitative predictions of 
increased risk of the filial generations could be strengthened by knowledge of the reproductive effects in 
the adult, the likelihood of bioaccumulation of the agent, and the potential for increased sensitivity 
resulting from exposure during critical periods of development (Gray, 1991). 

Occasionally, the severity of effects may be static or decreased with succeeding generations. 
When a decrease occurs, one explanation may be that the animals in the F1 and F2 generations
represent "survivors" who are (or become) more resistant to the agent than the average of the P 
generation. If such selection exists, then subsequent filial generations may show a reduced toxic 
response. Thus, significant adverse effects in any generation may be cause for concern regardless of 
results in other generations unless inconsistencies in the data indicate otherwise. 

3.1.6. Alternative Reproductive Tests

A number of alternative test designs have appeared in the literature (Lamb, 1985; Lamb and 
Chapin, 1985; Gray et al., 1988, 1989, 1990; Morrissey et al., 1989). Although not necessarily
viewed as replacements for the standard two-generation reproduction tests, data from these protocols
may be used on a case-by-case basis depending on what is known about the test agent in question.
When mutually agreed on by the testing organization and the Agency, such alternative protocols may
offer an expanded array of endpoints and increased flexibility (Francis and Kimmel, 1988).

A continuous breeding protocol, Fertility (or Reproductive) Assessment by Continuous 
Breeding (FACB or RACB), has been developed by the National Toxicology Program (NTP) (Lamb 
and Chapin, 1985; Morrissey et al., 1989; Gulati et al., 1991). As originally described, this protocol
(FACB) was a one-generation test. However, in the current design (RACB), dosing is extended into
the F1 generation to make it compatible with the EPA workshop recommendations for a two-
generation design (Francis and Kimmel, 1988). The RACB protocol is being used with both mice and
rats. A distinctive feature of this protocol is the continuous cohabitation of male-female pairs (in the P 
generation) for 14 weeks. Up to five litters can be produced with the pups removed soon after birth. 
This protocol provides information on changes in the spacing, number, and size of litters over the 14- 
week dosing interval. Treatment (three dose levels plus controls) is initiated in postpubertal males and 
females (11 weeks of age) seven days before cohabitation and continues throughout the test. Offspring
that are removed from the dam soon after birth are counted and examined for viability, litter and/or pup 
weight, sex, and external abnormalities and then discarded. The last litter may remain with the dam until 
weaning to study the effects of in utero as well as perinatal and postnatal exposures. If effects on 
fertility are observed in the P or F1 generations, additional reproductive evaluations may be conducted,
including fertility studies and crossover matings to define the affected gender and site of toxicity. 

The sequential production of litters from the same adults allows observation of the timing of
onset of an adverse effect on fertility. In addition, it improves the ability to detect subfertility due to the 
potential to produce larger numbers of pregnancies and litters than in a standard single- or 
multigeneration reproduction study. With continuous treatment, a cumulative effect could increase the 
incidence or extent of expression with subsequent litters. However, unless offspring were allowed to 
grow and reproduce (as they are routinely in the more recent version of the RACB protocol) (Gulati et 
al., 1991), little or no information will be available on postnatal development or reproductive capability
of a second generation. 

Sperm measures (including sperm number, morphology, and motility) and vaginal smear 
cytology to detect changes in estrous cyclicity have been added to the RACB protocol at the end of the
test period and their utility has been examined using model compounds in the mouse (Morrissey et al.,
1989). 

Another test method combines the use of multiple endpoints in both sexes of rats with initiation 
of treatment at weaning (Gray et al., 1988). Thus, morphologic and physiologic changes associated
with puberty are included as endpoints. Both P sexes are treated (at least three dose levels plus
controls) continuously through breeding, pregnancy, and lactation. The F1 generation is mated in a
continuous breeding protocol. Vaginal smears are recorded daily throughout the test period to evaluate
estrous cycle normality and confirm breeding and pregnancy (or pseudopregnancy). Pregnancy
outcome is monitored in both the P and F1 generations at all doses, and terminal studies on both
generations include comprehensive assessment of sperm measures (number, morphology, motility) as 
well as organ weights, histopathology, and the serum and tissue levels of appropriate reproductive 
hormones. As with the RACB, crossover mating studies may be conducted to identify the affected sex 
as warranted. This protocol combines the advantages of a continuous breeding design with acquisition 
of sex-specific multiple endpoint data at all doses. In addition, identification of pubertal effects makes 
this protocol particularly useful for detecting compounds with hormone-mediated actions such as 
environmental estrogens or antiandrogens. 

3.1.7. Additional Test Protocols That May Provide Reproductive Data 

Several shorter-term reproductive toxicity screening tests have been developed. Among those 
are the Reproductive/Developmental Toxicity Screening Test, which is part of the OECD's Screening 
Information Data Set protocol (Scala et al., 1992; Tanaka et al., 1992; OECD, 1993a), a tripartite 
protocol developed by the International Conference on Harmonization (International Conference on 
Harmonization of Technical Requirements of Pharmaceuticals for Human Use, 1994; Manson, 1994), 
and the NTP's Short-Term Reproductive and Developmental Toxicity Screen (Harris, M.W. et al.,
1992). These protocols have been developed for setting priorities for further testing and should not be 
considered sufficient by themselves to establish regulatory exposure levels. Their limited exposure 
periods do not allow assessment of certain aspects of the reproductive process, such as 
developmentally induced effects on the reproductive systems of offspring, that are covered by the 
multigeneration reproduction protocols. 

The male dominant lethal test was designed to detect mutagenic effects in the male 
spermatogenic process that are lethal to the offspring. A female dominant lethal protocol has also been 
used to detect equivalent effects on oogenesis (Generoso and Piegorsch, 1993). 
A review of the male dominant lethal test has been published as part of the EPA's Gene-Tox
Program (Green et al., 1985). Dominant lethal protocols may use acute dosing (1 to 5 days) followed
by serial matings with one or two females per male per week for the duration of the spermatogenic 
process. An alternative protocol may use subchronic dosing for the duration of the spermatogenic 
process followed by mating. Dose levels used with the acute protocol are usually higher than those 
used with the subchronic protocol. Females are monitored for evidence of mating, killed at 
approximately midgestation, and examined for incidence of pre- and postimplantation loss (see Section 
3.2.2 for discussions of these endpoints). 

Pre- or postimplantation loss in the dominant lethal test is often considered evidence that the 
agent has induced mutagenic damage to the male germ cell (U.S. EPA, 1986c). A genotoxic basis for 
a substantial portion of postimplantation loss is accepted widely. However, methods used to assess 
preimplantation loss do not distinguish between contributions of mutagenic events that cause embryo 
death and nonmutagenic factors that result in failure of fertilization or early embryo mortality (e.g., 
inadequate number of normal sperm, failure in sperm transport or ovum penetration). Similar effects
(fertilization failure, early embryo death) could also be produced indirectly by effects that delay the
timing of fertilization relative to time of ovulation. Such distinctions are important because cytotoxic 
effects on gametogenic cells do not imply the potential for transmittable genetic damage that is 
associated with mutagenic events. The interpretation of an increase in preimplantation loss may require 
additional data on the agent's mutagenic and gametotoxic potential if genotoxicity is to be factored into 
the risk assessment. Regardless, significant effects may be observed in a dominant lethal test that are 
considered reproductive in nature. 

An acute exposure protocol, combined with serial mating, may allow identification of the 
spermatogenic cell types that are affected by treatment. However, acute dosing may not produce 
adverse effects at levels as low as with subchronic dosing because of factors such as bioaccumulation. 
Conversely, if tolerance to an agent is developed with longer exposure, an effect may be observed after 
acute dosing that is not detected after longer-term dosing. 

Subchronic toxicity tests may have been conducted before a detailed reproduction study is 
initiated. In the subchronic toxicity test with rats, exposure usually begins at 6-8 weeks of age and is 
continued for 90 days (U.S. EPA, 1982, 1985b). Initiation of exposure at 8 weeks of age (compared 
with 6) and exposure for approximately 90 days allows the animals to reach a more mature stage of 
sexual development and assures an adequate length of dosing for observation of effects on the 
reproductive organs with most agents. The route of administration is often oral or by gavage but may 
be dermal or by inhalation. Animals are monitored for clinical signs throughout the test and are
necropsied at the end of dosing. 

The endpoints that are usually evaluated for the male reproductive system include visual 
examination of the reproductive organs, plus weights and histopathology for the testes, epididymides, 
and accessory sex glands. For the females, endpoints may include visual examination of the 
reproductive organs, uterine and ovarian weights, and histopathology of the vagina, uterus, cervix, 
ovaries, and mammary glands. 

This test may be useful to identify an agent as a potential reproductive hazard, but usually does 
not provide information about the integrated function of the reproductive systems (sexual behavior, 
fertility, and pregnancy outcomes), nor does it include effects of the agent on immature animals. 

Chronic toxicity tests provide an opportunity to evaluate toxic effects of long-term exposures. 
Oral, inhalation, or dermal exposure is initiated soon after weaning and is usually continued for 12 to 24 
months. Because of the extended treatment period, data from interim sacrifices may be available to 
provide useful information regarding the onset and sequence of toxicity. In males, the reproductive 
organs are examined visually, testes are weighed, and histopathologic examination is done on the testes 
and accessory sex glands. In females, the reproductive organs are examined visually, uterine and 
ovarian weights may be obtained, and histopathologic evaluation of the reproductive organs is done. 
The incidence of pathologic conditions is often increased in the reproductive tracts of aged control 
animals. Therefore, findings should be interpreted carefully. 

3.2. ENDPOINTS FOR EVALUATING MALE AND FEMALE REPRODUCTIVE 
TOXICITY IN TEST SPECIES 
3.2.1. Introduction 

The following discussion emphasizes endpoints that measure characteristics that are necessary 
for successful sexual performance and procreation. Other areas that are related less directly to 
reproduction are beyond the scope of these Guidelines. For example, secondary adverse health effects 
that may result from toxicity to the reproductive organs (e.g., osteoporosis or altered immune function), 
although important, are not included. 

In these Guidelines, the endpoints of reproductive toxicity are separated into three categories: 
couple-mediated, female-specific, and male-specific. Couple-mediated endpoints are those in which 
both sexes can have a contributing role if both partners are exposed. Thus, exposure of either sex or 
both sexes may result in an effect on that endpoint.

The discussions of endpoints and the factors influencing results that are presented in this section
are directed to evaluation and interpretation of results with test species. Many of those endpoints 
require invasive techniques that preclude routine use with humans. However, in some instances, related 
endpoints that can be used with humans are identified. Information that is specific for evaluation of 
effects on humans is presented in Section 3.3. 

Although statistical analyses are important in determining the effects of a particular agent, the
biological significance of data is most important. It is important to be aware that when many endpoints 
are investigated, statistically significant differences may occur by chance. On the other hand, apparent 
trends with dose may be biologically relevant even though pair-wise comparisons do not indicate a 
statistically significant effect. In each section, endpoints are identified in which significant changes may 
be considered adverse. However, concordance of results and known biology should be considered in 
interpreting all results. Results should be evaluated on a case-by-case basis with all of the evidence 
considered. Scientific judgment should be used extensively. All effects that may be considered as 
adverse are appropriate for use in establishing a NOAEL, LOAEL, or benchmark dose. 
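
The caution about chance findings can be made concrete with a small calculation. The sketch below is illustrative only: it assumes every endpoint is tested independently at the same significance level and that no true effect exists, which is a simplification (reproductive endpoints are usually correlated). The function name is hypothetical and does not come from the Guidelines.

```python
def chance_of_any_false_positive(n_endpoints: int, alpha: float = 0.05) -> float:
    """Probability that at least one of n independent tests is significant by chance.

    Assumption (not from the Guidelines): the tests are independent and no
    true effect exists, so each test has probability alpha of a false positive.
    """
    if n_endpoints < 0:
        raise ValueError("n_endpoints must be non-negative")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    # Complement of "no test is falsely significant".
    return 1.0 - (1.0 - alpha) ** n_endpoints


if __name__ == "__main__":
    for k in (1, 5, 20):
        print(k, round(chance_of_any_false_positive(k), 3))
```

Under these assumptions, 20 endpoints tested at the 0.05 level give roughly a 64% chance of at least one spurious significant difference, which is one reason concordance of results and known biology carry weight in interpretation.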

3.2.2. Couple-Mediated Endpoints 

Data on fertility potential and associated reproductive outcomes provide the most 
comprehensive and direct insight into reproductive capability. As noted previously, most protocols only 
specify cohabitation of exposed males with exposed females. This complicates the resolution of 
gender-specific influences. Conclusions may need to be restricted to noting that the "couple" is at 
reproductive risk when one or both parents are potentially exposed. 

3.2.2.1. Fertility and Pregnancy Outcomes 

Breeding studies with test species are a major source of data on reproductive toxicants. 
Evaluations of fertility and pregnancy outcomes provide measures of the functional consequences of 
reproductive injury. Measures of fertility and pregnancy outcome that are often obtained from 
multigeneration reproduction studies are presented in Table 2. Many endpoints that are pertinent for 
developmental toxicity are also listed and discussed in the Agency's Guidelines for Developmental 
Toxicity Risk Assessment (U.S. EPA, 1991). Also included in Table 2 are measures that may be 
obtained from other types of studies (e.g., single-generation reproduction studies, developmental 
toxicity studies, dominant lethal studies) in which offspring are not retained to evaluate subsequent 
reproductive performance.

Some of the endpoints identified above are used to calculate ratios or indices (NRC, 1977;
Collins, 1978; Schwetz et al., 1980; U.S. EPA, 1982, 1985b; Dixon and Hall, 1984; Lamb et al.,
1985; Thomas, 1991). While the presentation of such indices is not discouraged, the measurements 
used to calculate those indices should also be available for evaluation. Definitions of some of these 
indices in published literature vary substantially. Also, the calculation of an index may be influenced by 
the test design. Therefore, it is important that the methods used to calculate indices be specified. Some 
commonly reported indices are in Table 3.

Table 2. Couple-mediated endpoints of reproductive toxicity

Multigeneration studies

    Mating rate, time to mating (time to pregnancy*)
    Pregnancy rate*
    Delivery rate*
    Gestation length*
    Litter size (total and live)
    Number of live and dead offspring (fetal death rate*)
    Offspring gender* (sex ratio)
    Birth weight*
    Postnatal weights*
    Offspring survival*
    External malformations and variations*
    Offspring reproduction*

Other reproductive endpoints

    Ovulation rate
    Fertilization rate
    Preimplantation loss
    Implantation number
    Postimplantation loss*
    Internal malformations and variations*
    Postnatal structural and functional development*

*Endpoints that can be obtained with humans.

Table 3. Selected indices that may be calculated from endpoints of reproductive toxicity in
test species

MATING INDEX

    (Number of males or females mating / Number of males or females cohabited) x 100

Note: Mating is used to indicate that evidence of copulation (observation or other evidence of
ejaculation such as vaginal plug or sperm in vaginal smear) was obtained.

FERTILITY INDEX

    (Number of cohabited females becoming pregnant / Number of nonpregnant couples cohabited) x 100

Note: Because both sexes are often exposed to an agent, distinction between sexes often is not
possible. If responsibility for an effect can be clearly assigned to one sex (as when treated animals are
mated with controls), then a female or male fertility index could be useful.

GESTATION (PREGNANCY) INDEX

    (Number of females delivering live young / Number of females with evidence of pregnancy) x 100

LIVE BIRTH INDEX

    (Number of live offspring / Number of offspring delivered) x 100

SEX RATIO

    Number of male offspring / Number of female offspring

4-DAY SURVIVAL INDEX (VIABILITY INDEX)

    (Number of live offspring at lactation day 4 / Number of live offspring delivered) x 100

Note: This definition assumes that no standardization of litter size is done until after the day 4
determination is completed.

LACTATION INDEX (WEANING INDEX)

    (Number of live offspring at day 21 / Number of live offspring born) x 100

Note: If litters were standardized to equalize numbers of offspring per litter, number of
offspring after standardization should be used instead of number born alive. When no standardization is
done, measure is called weaning index. When standardization is done, measure is called lactation
index.

PREWEANING INDEX

    ((Number of live offspring born - Number of offspring weaned) / Number of live offspring born) x 100

Note: If litters were standardized to equalize numbers of offspring per litter, then number of
offspring remaining after standardization should be used instead of number born.
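
Several of the Table 3 indices are simple percentages of counts, so they can be sketched in a few lines. The example below is a minimal illustration, not a prescribed method: the `GroupCounts` record, its field names, and the function names are hypothetical, litters are assumed not to have been standardized, and the number of offspring weaned is assumed to equal the number alive at day 21. The fertility index and sex ratio are left out to keep the sketch short.

```python
from dataclasses import dataclass


@dataclass
class GroupCounts:
    """Hypothetical tallies for one dose group; field names are illustrative."""
    cohabited: int        # males or females placed in cohabitation
    mated: int            # animals with evidence of copulation
    pregnant: int         # females with evidence of pregnancy
    delivering_live: int  # females delivering live young
    delivered: int        # offspring delivered (live and dead)
    live_born: int        # live offspring delivered
    live_day4: int        # live offspring at lactation day 4
    live_day21: int       # live offspring at day 21 (assumed equal to number weaned)


def percent(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage, or NaN when undefined."""
    if denominator == 0:
        return float("nan")
    return 100.0 * numerator / denominator


def reproductive_indices(c: GroupCounts) -> dict:
    """Compute a subset of the Table 3 indices, assuming no litter standardization."""
    return {
        "mating_index": percent(c.mated, c.cohabited),
        "gestation_index": percent(c.delivering_live, c.pregnant),
        "live_birth_index": percent(c.live_born, c.delivered),
        "viability_index_day4": percent(c.live_day4, c.live_born),
        "weaning_index": percent(c.live_day21, c.live_born),
        "preweaning_index": percent(c.live_born - c.live_day21, c.live_born),
    }
```

Returning the raw counts alongside the indices would match the recommendation that the measurements used to calculate indices also be available for evaluation.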

Mating rate may be reported for the mated pairs, males only or females only. Evidence of
mating may be direct observation of copulation, observation of copulatory plugs, or observation of 
sperm in the vaginal fluid (vaginal lavage). The mating rate may be influenced by the number of estrous 
cycles allowed or required for pregnancy to occur. Therefore, mating rate and fertility data from the 
first estrous cycle after initiation of cohabitation should be more discriminating than measurements 
involving multiple cycles. Evidence of mating does not necessarily mean successful impregnation. 

A useful indicator of impaired reproductive function may be the length of time required for each 
pair to mate after the start of cohabitation (time to mating). An increased interval between initiation of 
cohabitation and evidence of mating suggests abnormal estrous cyclicity in the female or impaired sexual 
behavior in one or both partners. 

The time to mating for normal pairs (rat or mouse) could vary by 3 or 4 days depending on the
stage of the estrous cycle at the start of cohabitation. If the stage of the estrous cycle at the time of 
cohabitation is known, the component of the variance due to variation in stage at cohabitation can be 
removed in the data analysis. 

Data on fertilization rate, the proportion of available ova that were fertilized, are seldom 
available because the measurement requires necropsy very early in gestation. Pregnancy rate is the 
proportion of mated pairs that have produced at least one pregnancy within a fixed period where 
pregnancy is determined by the earliest available evidence that fertilization has occurred. Generally, a 
more meaningful measure of fertility results when the mating opportunity was limited to one mating 
couple and to one estrous cycle (see Sections 3.1.3 and 3.1.4). 

The timing and integrity of gamete and zygote transport are important to fertilization and embryo 
survival and are quite susceptible to chemical perturbation. Disruption of the processes that contribute 
to a reduction in fertilization rate and increased early embryo loss are usually identified simply as
preimplantation loss. Additional studies using direct assessments of fertilized ova and early embryos 
would be necessary to identify the cause of increased preimplantation loss (Cummings and Perreault, 
1990). Preimplantation loss (described below) occurs in untreated as well as treated rodents and 
contributes to the normal variation in litter size. 

After mating, uterine and oviductal contractions are critical in the transport of spermatozoa from 
the vagina. In rodents, sufficient stimulation during mating is necessary for initiation of those 
contractions. Thus, impaired mating behavior may affect sperm transport and fertilization rate. 
Exposure of the female to estrogenic compounds can alter gamete transport. In women, low doses of 
exogenous estrogens may accelerate ovum transport to a detrimental extent, whereas high doses of 
estrogens or progestins delay transport and increase the incidence of ectopic pregnancies.

Mammalian ova are surrounded by investments that the sperm must penetrate before fusing
with ova. Chemicals may block fertilization by preventing this passage. Other agents may impair fusion 
of the sperm with the oolemma, transformations of the sperm or ovum chromatin into the male and 
female pronuclei, fusion of the pronuclei, or the subsequent cleavage divisions. Carbendazim, an 
inhibitor of microtubule synthesis, is an example of a chemical that can interfere with oocyte maturation 
and normal zygote formation after sperm-egg fusion by affecting meiosis (Perreault et al., 1992; Zuelke
and Perreault, 1995). The early zygote is also susceptible to detrimental effects of mutagens such as
ethylene oxide (Generoso et al., 1987).

Fertility assessments in test animals have limited sensitivity as measures of reproductive injury. 
Therefore, results demonstrating no treatment-related effect on fertility may be given less weight than 
other endpoints that are more sensitive. Unlike humans, normal males of most test species produce 
sperm in numbers that greatly exceed the minimum requirements for fertility, particularly as evaluated in 
protocols that allow multiple matings (Amann, 1981; Working, 1988). In some strains of rats and mice,
production of normal sperm can be reduced by up to 90% or more without compromising fertility 
(Aafjes et al., 1980; Meistrich, 1982; Robaire et al., 1984; Working, 1988). However, less severe 
reductions can cause reduced fertility in human males who appear to function closer to the threshold for 
the number of normal sperm needed to ensure full reproductive competence (see Supplementary 
Information). This difference between test species and humans means that negative results with test 
species in a study that was limited to endpoints that examined only fertility and pregnancy outcomes 
would provide insufficient information to conclude that the test agent poses no reproductive hazard in 
humans. It is unclear whether a similar consideration is applicable for females for some mechanisms of 
toxicity. 

The limited sensitivity of fertility measures in rodents also suggests that a NOAEL, LOAEL, or 
benchmark dose (see Section 4) based on fertility may not reflect completely the extent of the toxic 
effect. In such instances, data from additional reproductive endpoints might indicate that an adverse 
effect could occur at a lower dose level. In the absence of such data, the margin of exposure or 
uncertainty factor applied to the NOAEL, LOAEL, or benchmark dose may need to be adjusted to 
reflect the additional uncertainty (see Section 4). 

Both the blastocyst and the uterus must be ready for implantation, and their synchronous 
development is critical (Cummings and Perreault, 1990). The preparation of the uterine endometrium 
for implantation is under the control of sequential estrogen and progesterone stimulation. Treatments 
that alter the internal hormonal environment or inhibit protein synthesis, mitosis, or cell differentiation can 
block implantation and cause embryo death.

Gestation length can be determined in test animals from data on day of mating (observation of
vaginal plug or sperm-positive vaginal lavage) and day of parturition. Significant shortening of gestation 
can lead to adverse outcomes of pregnancy such as decreased birth weight and offspring survival. 
Significantly longer gestation may be caused by failure of the normal mechanism for parturition and may 
result in death or impairment of offspring if dystocia (difficulty in parturition) occurs. Dystocia 
constitutes a maternal health threat for humans as well as test species. Lengthened gestation may result 
in higher birth weight, an effect that could mask a slower growth rate in utero because of exposure to a
toxic agent. Comparison of offspring weights based on conceptional age may allow insight, although 
this comparison is complicated by generally faster growth rates postnatally than in utero. 

Litter size is the number of offspring delivered and is measured at or soon after birth. Unless 
this observation is made soon after parturition, the number of offspring observed may be less than the 
actual number delivered because of cannibalism by the dam. Litter size is affected by the number of 
ova available for fertilization (ovulation rate), fertilization rate, implantation rate, and the proportion of 
the implanted embryos that survives to parturition. Litter size may include dead as well as live offspring;
therefore, data on the numbers of live and dead offspring should also be available.

When pregnant animals are examined by necropsy in mid- to late gestation, pregnancy status, 
including pre- and postimplantation losses can be determined. Postimplantation loss can be determined 
also by examining uteri from postparturient females. Preimplantation loss is the (number of corpora 
lutea minus number of implantation sites)/number of corpora lutea. Postimplantation loss, determined 
following delivery of a litter, is the (total number of implantation sites minus number of full-term
pups)/number of implantation sites. 
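
The two loss definitions above translate directly into code. This is a minimal sketch with hypothetical function names; it returns proportions rather than percentages and does no more than guard against a zero denominator.

```python
def preimplantation_loss(corpora_lutea: int, implantation_sites: int) -> float:
    """(corpora lutea - implantation sites) / corpora lutea, as a proportion."""
    if corpora_lutea <= 0:
        raise ValueError("corpora_lutea must be positive")
    return (corpora_lutea - implantation_sites) / corpora_lutea


def postimplantation_loss(implantation_sites: int, full_term_pups: int) -> float:
    """(implantation sites - full-term pups) / implantation sites, as a proportion."""
    if implantation_sites <= 0:
        raise ValueError("implantation_sites must be positive")
    return (implantation_sites - full_term_pups) / implantation_sites


if __name__ == "__main__":
    # Illustrative numbers only: 14 corpora lutea, 12 implantation sites, 9 pups.
    print(preimplantation_loss(14, 12))
    print(postimplantation_loss(12, 9))
```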

Offspring gender in mammals is determined by the male through fertilization of an ovum by a 
Y- or an X-chromosome-bearing sperm. Therefore, selective impairment in the production, transport, 
or fertilizing ability of either of these sperm types can produce an alteration in the sex ratio. An agent 
may also induce selective loss of male or female fetuses. Further, alteration of the external sexual 
characteristics of offspring by agents that disrupt sexual development may produce apparent effects on 
sex ratios. Although not examined routinely, these factors provide the most likely explanations for 
alterations in the sex ratio. 

Birth weight should be measured on the day of parturition. Often data from individual pups as 
well as the entire litter (litter weight) are provided. Birth weights are influenced by intrauterine growth 
rates, litter size, and gestation length. Growth rate in utero is influenced by the normality of the fetus, 
the maternal environment, and gender, with females tending to be smaller than males (Tyl, 1987). 
Individual pups in large litters tend to be smaller than pups in smaller litters. Thus, reduced birth weights 
that can be attributed to large litter size should not be considered an adverse effect unless the increased
litter size is treatment related and the subsequent ability of the offspring to survive or develop is 
compromised. Multivariate analyses may be used to adjust pup weights for litter size (e.g., analysis of 
covariance, multiple regression). When litter weights only are reported, the increased numbers of 
offspring and the lower weights of the individuals tend to offset each other. When prenatal or postnatal 
growth is impaired by an acute exposure, compensatory growth after cessation of dosing could obscure 
the earlier effect. 
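
One way to picture the litter-size adjustment mentioned above is the adjusted-means step of an analysis of covariance. The sketch below is an assumption-laden illustration, not the Agency's procedure: it treats the litter as the unit, fits a single pooled within-group slope of mean pup weight on litter size, and shifts each group mean to the overall mean litter size. The function name and data layout are hypothetical, and no significance test is performed.

```python
from statistics import fmean


def litter_size_adjusted_means(groups):
    """Return (pooled slope, adjusted mean pup weight per group).

    groups maps a dose-group label to a list of (litter_size, mean_pup_weight)
    tuples, one tuple per litter. Assumes a common linear relation between
    litter size and pup weight in every group (an ANCOVA-style assumption).
    """
    sxy = 0.0
    sxx = 0.0
    all_sizes = []
    group_means = {}
    for label, litters in groups.items():
        sizes = [size for size, _ in litters]
        weights = [weight for _, weight in litters]
        mean_size, mean_weight = fmean(sizes), fmean(weights)
        group_means[label] = (mean_size, mean_weight)
        all_sizes.extend(sizes)
        # Accumulate within-group sums of products and squares.
        sxy += sum((s - mean_size) * (w - mean_weight) for s, w in litters)
        sxx += sum((s - mean_size) ** 2 for s in sizes)
    if sxx == 0.0:
        raise ValueError("litter size does not vary within any group")
    slope = sxy / sxx
    grand_mean_size = fmean(all_sizes)
    adjusted = {
        label: mean_weight - slope * (mean_size - grand_mean_size)
        for label, (mean_size, mean_weight) in group_means.items()
    }
    return slope, adjusted
```

A difference between adjusted means that persists after this step is not explained by litter size alone under the stated assumptions; whether it is adverse remains a matter for the case-by-case judgment described in the text.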

Postnatal weights are dependent on birth weight, sex, and normality of the individual, as well 
as the litter size, lactational ability of the dam, and suckling ability of the offspring. With large litters, 
small or weak offspring may not compete successfully for milk and show impaired growth. Because it 
is not possible usually to determine whether the effect was due solely to the increased litter size, growth 
retardation or decreased survival rate should be considered adverse in the absence of information to the
contrary. Also, offspring weights may appear normal in very small litters and should be considered
carefully in relation to controls. 

Offspring survival is dependent on the same factors as postnatal weight, although more severe 
effects are necessary usually to affect survival. All weight and survival endpoints can be affected by 
toxicity of an agent, either by direct effects on the offspring or indirectly through effects on the ability of 
the dam to support the offspring. 

Measures of malformations and variations, as well as postnatal structural and functional 
development, are presented in the Guidelines for Developmental Toxicity Risk Assessment and the 
Proposed Guidelines for Neurotoxicity Risk Assessment (U.S. EPA, 1991, 1995a). These
documents should be consulted for additional information on those parameters. 

3.2.2.1.1. Adverse effects. Table 2 lists couple-mediated endpoints that may be measured in 
reproduction studies. Table 3 presents examples of indices that may be calculated from couple- 
mediated reproductive toxicity data. Significant detrimental effects on any of those endpoints or on 
indices derived from those data should be considered adverse. Whether effects are on the female 
reproductive system or directly on the embryo or fetus is often not distinguishable, but the distinction 
may not be important because all of these effects should be cause for concern. 

3.2.2.2. Sexual Behavior 

Sexual behavior reflects complex neural, endocrine, and reproductive organ interactions and is 
therefore susceptible to disruption by a variety of toxic agents and pathologic conditions. Interference 
with sexual behavior in either sex by environmental agents represents a potentially significant human
reproductive problem. Most human information comes from studies on effects of drugs on sexual 
behavior or from clinical reports in which the detection of exposure-effect associations is unlikely. Data 
on sexual behavior are usually not available from studies of human populations that were exposed 
occupationally or environmentally to potentially toxic agents, nor are such data obtained routinely in 
studies of environmental agents with test species. 

In the absence of human data, the perturbation of sexual behavior in test species suggests the 
potential for similar effects on humans. Consistent with this position are data showing that central 
nervous system effects can disrupt sexual behavior in both test species and humans (Rubin and Henson, 
1979; Waller et al., 1985). Although the functional components of sexual performance can be
quantified in most test species, no direct evaluation of this behavior is done in most breeding studies. 
Rather, copulatory plugs or sperm-positive vaginal lavages are taken as evidence of sexual receptivity 
and successful mating. However, these markers do not demonstrate whether male performance 
resulted in adequate sexual stimulation of the female. Failure of the male to provide adequate 
stimulation to the female may impair sperm transport in the genital tract of female rats, thereby reducing
the probability of successful impregnation (Adler and Toner, 1986). Such a "mating" failure would be 
reflected in the calculated fertility index as reduced fertility and could be attributed erroneously to an 
effect on the spermatogenic process in the male or on fertility of the female. 

In the rat, a direct measure of female sexual receptivity is the occurrence of lordosis. Sexual 
receptivity of the female rat is normally cyclic, with receptivity commencing during the late evening of 
vaginal proestrus. Agents that interfere with normal estrous cyclicity also could cause absence of or 
abnormal sexual behavior that can be reflected in reduced numbers of females with vaginal plugs or 
vaginal sperm, alterations in lordosis behavior, and increased time to mating after start of cohabitation. 
In the male, measures include latency periods to first mount, mount with intromission, and first 
ejaculation, number of mounts with intromission to ejaculation, and the postejaculatory interval (Beach, 
1979). 

Direct evaluation of sexual behavior is not warranted for all agents being tested for reproductive 
toxicity. Some likely candidates may be agents reported to exert central or peripheral neurotoxicity. 
Chemicals possessing or suspected to possess androgenic or estrogenic properties (or antagonistic 
properties) also merit consideration as potentially causing adverse effects on sexual behavior 
concomitant with effects on the reproductive organs. 

3.2.2.2.1. Adverse effects. Effects on sexual behavior (within the limited definition of these 
Guidelines) should be considered as adverse reproductive effects. Included is evidence of impaired 
sexual receptivity and copulatory behavior. Impairment that is secondary to more generalized physical 
debilitation (e.g., impaired rear leg motor activity or general lethargy) should not be considered an 
adverse reproductive effect, although such conditions represent adverse systemic effects. 

3.2.3. Male-Specific Endpoints 

3.2.3.1. Introduction 

The following sections (3.2.3 and 3.2.4) describe various male-specific and female-specific 
endpoints of reproductive toxicity that can be obtained. Included are endpoints for which data are 
obtained routinely by the Agency and other endpoints for which data may be encountered in the review 
of chemicals. Guidance is presented for interpretation of results involving these endpoints and their use 
in risk assessment. Effects are identified that should be considered as adverse reproductive effects if 
significantly different from controls. 

The Agency may obtain data on the potential male reproductive toxicity of an agent from many 
sources including, but not limited to, studies done according to Agency test guidelines. These may 
include acute, subchronic, and chronic testing and reproduction and fertility studies. Male-specific 
endpoints that may be encountered in such studies are identified in Table 4. 

3.2.3.2. Body Weight and Organ Weights 

Monitoring body weight during treatment provides an index of the general health status of the 
animals, and such information may be important for the interpretation of reproductive effects (see also 
Section 3.2.2). Depression in body weight or reduction in weight gain may reflect a variety of 
responses, including rejection of chemical-containing food or water because of reduced palatability, 
treatment-induced anorexia, or systemic toxicity. Less than severe reductions in adult body weight 
induced by restricted nutrition have shown little effect on the male reproductive organs or on male 
reproductive function (Chapin et al, 1993a,b). When a meaningful, biologic relationship between a 
body weight decline and a significant effect on the male reproductive system is not apparent, it is not 
appropriate to dismiss significant alteration of the male reproductive system as secondary to the 
occurrence of nonreproductive toxicity. Unless additional data provide the needed clarification, 
alteration in a reproductive measure that would otherwise be considered adverse should still be 
considered as an adverse male reproductive effect in the presence of mild to moderate body weight 
changes. In the presence of severe body weight depression or other severe systemic debilitation, it 
should be noted that an adverse effect on a reproductive endpoint occurred, but the effect may have 
resulted from a more generalized toxic effect. Regardless, adverse effects would have been observed 
in that situation and a risk assessment should be pursued if sufficient data are available. 

Table 4. Male-specific endpoints of reproductive toxicity 

Organ weights                Testes, epididymides, seminal vesicles, prostate, pituitary 

Visual examination and       Testes, epididymides, seminal vesicles, prostate, pituitary 
histopathology 

Sperm evaluation*            Sperm number (count) and quality (morphology, motility) 

Sexual behavior*             Mounts, intromissions, ejaculations 

Hormone levels*              Luteinizing hormone, follicle stimulating hormone, 
                             testosterone, estrogen, prolactin 

Developmental effects        Testis descent*, preputial separation, sperm production*, 
                             ano-genital distance, structure of external genitalia* 

*Reproductive endpoints that can be obtained or estimated relatively noninvasively with humans. 

The male reproductive organs for which weights may be useful for reproductive risk assessment 
include the testes, epididymides, pituitary gland, seminal vesicles (with coagulating glands), and 
prostate. Organ weight data may be presented as both absolute weights and as relative weights (i.e., 
organ weight to body weight ratios). Organ weight data may also be reported relative to brain weight 
since, subsequent to development, the weight of the brain usually remains quite stable (Stevens and 
Gallo, 1989). Evaluation of data on absolute organ weights is important, because a decrease in a 
reproductive organ weight may occur that was not necessarily related to a reduction in body weight 
gain. The organ weight-to-body weight ratio may show no significant difference if both body weight 
and organ weight change in the same direction, masking a potential organ weight effect. 
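
The masking effect described in the preceding paragraph can be illustrated with a small numerical sketch. The weights below are hypothetical values chosen only for illustration (they are not data from any study), and the function name is an assumption introduced here. 

```python
import math


def organ_weight_summary(organ_g, body_g, brain_g):
    """Return the absolute organ weight and its ratios to body and brain weight."""
    return {
        "absolute_g": organ_g,
        "organ_to_body": organ_g / body_g,
        "organ_to_brain": organ_g / brain_g,
    }


# Hypothetical group means in grams; not taken from any study.
control = organ_weight_summary(organ_g=3.20, body_g=400.0, brain_g=2.0)
treated = organ_weight_summary(organ_g=2.88, body_g=360.0, brain_g=2.0)

# True when the organ-to-body ratio is (numerically) the same in both groups.
ratio_unchanged = math.isclose(control["organ_to_body"], treated["organ_to_body"])

for label, row in (("control", control), ("treated", treated)):
    print(label, row)
print("organ-to-body ratio unchanged:", ratio_unchanged)
```

In this sketch the absolute organ weight falls by 10 percent, yet the organ-to-body ratio is unchanged because body weight fell by the same proportion. The ratio to brain weight still shows the decline, under the assumption that brain weight is stable. 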

Normal testis weight varies only modestly within a given test species (Schwetz et al, 1980; 
Blazak et al, 1985). This relatively low interanimal variability suggests that absolute testis weight 
should be a precise indicator of gonadal injury. However, damage to the testes may be detected as a 
weight change only at doses higher than those required to produce significant effects in other measures 
of gonadal status (Berndtson, 1977; Foote et al., 1986; Ku et al., 1993). This contradiction may arise 
from several factors, including a delay before cell deaths are reflected in a weight decrease (due to 
preceding edema and inflammation, cellular infiltration) or Leydig cell hyperplasia. Blockage of the 
efferent ducts by cells sloughed from the germinal epithelium or the efferent ducts themselves can lead 
to an increase in testis weight due to fluid accumulation (Hess et al, 1991; Nakai et al., 1993), an effect 
that could offset the effect of depletion of the germinal epithelium on testis weight. Thus, while testis 
weight measurements may not reflect certain adverse testicular effects and do not indicate the nature of 
an effect, a significant increase or decrease is indicative of an adverse effect. 

Pituitary gland weight can provide valuable insight into the reproductive status of the animal. 
However, the pituitary contains cell types that are responsible for the regulation of a variety of 
physiologic functions including some that are separate from reproduction. Thus, changes in pituitary 
weight may not necessarily reflect reproductive impairment. If weight changes are observed, 
gonadotroph-specific histopathologic evaluations may be useful in identifying the affected cell types. 
This information may then be used to judge whether the observed effect on the pituitary is related to 
reproductive system function and therefore an adverse reproductive effect. 

Prostate and seminal vesicle weights are androgen-dependent and may reflect changes in the 
animal's endocrine status or testicular function. Separation of the seminal vesicles and coagulating gland 
(dorsal prostate) is difficult in rodents. However, the seminal vesicle and prostate can be separated and 
results may be reported for these glands separately or together, with or without their secretory fluids. 
Differential loss of secretory fluids prior to weighing could produce artifactual weights. Because the 
seminal vesicles and prostate may respond differently to an agent (endocrine dependency and 
developmental susceptibility differ), more information may be gained if the weights were examined 
separately. 



3.2.3.2.1. Adverse effects. Significant changes in absolute or relative male reproductive organ 
weights may constitute an adverse reproductive effect. Such changes also may provide a basis for 
obtaining additional information on the reproductive toxicity of that agent. However, significant changes 
in other important endpoints that are related to reproductive function may not be reflected in organ 
weight data. Therefore, lack of an organ weight effect should not be used to negate significant changes 
in other endpoints that may be more sensitive. 

3.2.3.3. Histopathologic Evaluations 

Histopathologic evaluations of test animal tissues have a prominent role in male reproductive 
risk assessment. Organs that are often evaluated include the testes, epididymides, prostate, seminal 
vesicles (often including coagulating glands), and pituitary. Tissues from lower dose exposures are 
often not examined histologically if the high dose produced no difference from controls. Histologic 
evaluations can be especially useful by (1) providing a relatively sensitive indicator of damage; (2) 
providing information on toxicity from a variety of protocols; (3) with short-term dosing, providing 
information on site (including target cells) and extent of toxicity; and (4) indicating the potential for 
recovery. 

The quality of the information presented from histologic analyses of spermatogenesis is 
improved by proper fixation and embedding of testicular tissue. With adequately prepared tissue 
(Chapin, 1988; Russell et al., 1990; Hess and Moore, 1993), a description of the nature and 
background level of lesions in control tissue, whether preparation-induced or otherwise, can facilitate 
interpreting the nature and extent of the lesions observed in tissues obtained from exposed animals. 
Many histopathologic evaluations of the testis only detect lesions if the germinal epithelium is severely 
depleted or degenerating, if multinucleated giant cells are obvious, or if sloughed cells are present in the 
tubule lumen. More subtle lesions, such as retained spermatids or missing germ cell types, that can 
significantly affect the number of sperm being released normally into the tubule lumen may not be 
detected when less adequate methods of tissue preparation are used. Also, familiarity with the detailed 
morphology of the testis and the kinetics of spermatogenesis of each test species can assist in the 
identification of less obvious lesions that may accompany lower dose exposures or lesions that result 
from short-term exposure (Russell et al, 1990). Several approaches for qualitative or quantitative 
assessment of testicular tissue are available that can assist in the identification of less obvious lesions that 
may accompany lower-dose exposures, including use of the technique of "staging." A book is available 
(Russell et al, 1990) which provides extensive information on tissue preparation, examination, and 
interpretation of observations for normal and high resolution histology of the germinal epithelium of rats, 
mice, and dogs. Included is guidance for identification and quantification of the various cell types and 
associations for each stage of the spermatogenic cycle. Also, a decision-tree scheme for staging with 
the rat has been published (Hess, 1990). 

The basic morphology of other male reproductive organs (e.g., epididymides, accessory sex 
glands, and pituitary) has been described as well as the histopathologic alterations that may accompany 
certain disease states (Fawcett, 1986; Jones et al., 1987; Haschek and Rousseaux, 1991). Compared 
with the testes, less is known about structural changes in these tissues that are associated with exposure 
to toxic agents. With the epididymides and accessory sex glands, histologic evaluation is usually limited 
to the height and possibly the integrity of the secretory epithelium. Evaluation should include information 
on the caput, corpus, and cauda segments of the epididymis. Presence of debris and sloughed cells in 
the epididymal lumen are valuable indicators of damage to the germinal epithelium or the excurrent 
ducts. The presence of lesions such as sperm granulomas, leucocyte infiltration (inflammation) or 
absence of clear cells in the cauda epididymal epithelium should be noted. Information from 
examinations of the pituitary should include evaluation of the morphology of the cell types that produce 
the gonadotropins and prolactin. 

The degree to which histopathologic effects are quantified is usually limited to classifying 
animals, within dose groups, as either affected or not affected by qualitative criteria. Little effort has 
been made to quantify the extent of injury, and procedures for such classifications are not applied 
uniformly (Linder et al., 1990). Evaluation procedures would be facilitated by adoption of more 
uniform approaches for quantifying the extent of histopathologic damage per individual. In the absence 
of standardized tissue preparation techniques and a standardized quantification system, the evaluation of 
histopathologic data would be facilitated by the presentation of the evaluation criteria and procedure by 
which the level of lesions in exposed individuals was judged to be in excess of controls. 

If properly obtained (i.e., proper preparation and analysis of tissue), data from histopathologic 
evaluations may provide a relatively sensitive tool that is useful for detection of low-dose effects. This 
approach may also provide insight into sites and mechanisms of action for the agent on that 
reproductive organ. When similar targets or mechanisms exist in humans, the basis for interspecies 
extrapolation is strengthened. Depending on the experimental design, information can also be obtained 
that may allow prediction of the eventual extent of injury and degree of recovery in that species and 
humans (Russell, 1983). 

3.2.3.3.1. Adverse effects. Significant and biologically meaningful histopathologic damage in excess 
of the level seen in control tissue of any of the male reproductive organs should be considered an 
adverse reproductive effect. Significant histopathologic damage in the pituitary should be considered as 
an adverse effect but should be shown to involve cells that control gonadotropin or prolactin production 
to be called a reproductive effect. Although thorough histopathologic evaluations that fail to reveal any 
treatment-related effects may be quite convincing, consideration should be given to the possible 
presence of other testicular or epididymal effects that are not detected histologically (e.g., genetic 
damage to the germ cell, decreased sperm motility), but may affect reproductive function. 

3.2.3.4. Sperm Evaluations 

The parameters that are important for sperm evaluations are sperm number, sperm 
morphology, and sperm motility. Data on those parameters allow more adequate estimation of the 
number of "normal" sperm, a parameter that is likely to be more informative than sperm number alone. 
Although effects on sperm production can be reflected in other measures such as testicular spermatid 
count or cauda epididymal weight, no surrogate measures are adequate to reflect effects on sperm 
morphology or motility. Similar data can be obtained noninvasively from human ejaculates, enhancing 
the ability to confirm effects seen in test species or to detect effects in humans. Brief descriptions of 
these measures are provided below, followed by a discussion of the use of various sperm measures in 
male reproductive risk assessment. 

3.2.3.4.1. Sperm number. Measures of sperm concentration (count) have been the most frequently 
reported semen variable in the literature on humans (Wyrobek et al, 1983a). Sperm number or sperm 
concentration from test species may be derived from ejaculated, epididymal, or testicular samples 
(Seed et al, 1996). Of the common test species, ejaculates can only be obtained readily from rabbits 
or dogs. Ejaculates can be recovered from the reproductive tracts of mated females of other species 
(Zenick et al, 1984). Measures of human sperm production are usually derived from ejaculates, but 
could also be obtained from spermatid counts or quantitative histology using testicular biopsy tissue 
samples. With ejaculates, both sperm concentration (number of sperm/mL of ejaculate) and total 
sperm per ejaculate (sperm concentration x volume) should be evaluated. 
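
As a minimal sketch of the two ejaculate measures named above, the following computes total sperm per ejaculate as concentration times volume. The function name and the sample values are hypothetical and serve only to illustrate the arithmetic. 

```python
def total_sperm_per_ejaculate(concentration_per_ml, volume_ml):
    """Total sperm per ejaculate = sperm concentration x ejaculate volume."""
    if concentration_per_ml < 0 or volume_ml < 0:
        raise ValueError("concentration and volume must be non-negative")
    return concentration_per_ml * volume_ml


# Hypothetical sample: 60 million sperm/mL in a 2.5 mL ejaculate.
total = total_sperm_per_ejaculate(60e6, 2.5)
print(f"{total:.3g} sperm per ejaculate")
```

Reporting both values matters because a change in ejaculate volume can alter concentration without any change in the total number of sperm, and the reverse. 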

Ejaculated sperm number from any species is influenced by several variables, including the 
length of abstinence and the ability to obtain the entire ejaculate. Intra- and interindividual variation are 
often high, but are reduced somewhat if ejaculates are collected at regular intervals from the same 
male (Williams et al, 1990). Such a longitudinal study design has improved detection sensitivity and 
thus requires a smaller number of subjects (Wyrobek et al, 1984). In addition, if a pre-exposure 
baseline is obtained for each male (test animal or human studies when allowed by protocol), then 
changes during exposure or recovery can be better defined. 
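
One simple way to use a pre-exposure baseline is to express each later measurement as a percent change from the same male's own baseline. The sketch below assumes that approach; the identifiers and values are hypothetical, and it is not a substitute for the statistical analysis a longitudinal study would require. 

```python
def percent_change_from_baseline(baseline, during):
    """Per-male percent change, using each male's own pre-exposure value."""
    changes = {}
    for male_id, base in baseline.items():
        if base <= 0:
            raise ValueError(f"baseline for {male_id} must be positive")
        changes[male_id] = 100.0 * (during[male_id] - base) / base
    return changes


# Hypothetical sperm concentrations (million/mL); not study data.
baseline = {"M1": 80.0, "M2": 40.0}
during = {"M1": 60.0, "M2": 30.0}
changes = percent_change_from_baseline(baseline, during)
print(changes)
```

The two hypothetical males differ twofold in absolute concentration but show the same relative decline, which is the kind of within-male comparison that a baseline makes possible despite high interindividual variation. 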

Epididymal sperm evaluations with test species usually use sperm from only the cauda portion 
of the epididymis, but the samples for sperm motility and morphology may be derived also from the vas 
deferens. It has been customary to express the sperm count in relation to the weight of the cauda 
epididymis. However, because sperm contribute to epididymal weight, expression of the data as a ratio 
may actually mask declines in sperm number. The inclusion of data on absolute sperm counts can 
improve resolution. As is true for ejaculated sperm counts, epididymal sperm counts are influenced 
directly by level of sexual activity (Amann, 1981; Hurtt and Zenick, 1986). 
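
The way a weight-normalized count can mask a decline is easiest to see with numbers. In the sketch below all values are hypothetical, and the assumption that cauda weight falls along with sperm content is made only to illustrate the point in the text. 

```python
def sperm_per_mg_cauda(sperm_count, cauda_weight_mg):
    """Sperm count normalized to cauda epididymis weight (sperm per mg)."""
    return sperm_count / cauda_weight_mg


def percent_decline(control, treated):
    """Percent decline of the treated value relative to control."""
    return 100.0 * (control - treated) / control


# Hypothetical values; cauda weight falls partly because sperm are lost.
control_count, control_cauda_mg = 200e6, 300.0
treated_count, treated_cauda_mg = 150e6, 240.0

absolute_decline = percent_decline(control_count, treated_count)
ratio_decline = percent_decline(
    sperm_per_mg_cauda(control_count, control_cauda_mg),
    sperm_per_mg_cauda(treated_count, treated_cauda_mg),
)
print(f"absolute count: {absolute_decline:.2f}%  per mg cauda: {ratio_decline:.2f}%")
```

With these hypothetical inputs the absolute count declines by a quarter, while the per-milligram ratio declines far less, so reporting only the ratio would understate the effect. 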

Sperm production data may be derived from counts of the distinctive elongated spermatid 
nuclei that remain after homogenization of testes in a detergent-containing medium (Amann, 1981; 
Meistrich, 1982; Cassidy et al, 1983; Blazak et al., 1993). The elongated spermatid counts are a 
measure of sperm production from the stem cells and their ensuing survival through 
spermatocytogenesis and spermiogenesis (Meistrich, 1982; Meistrich and van Beek, 1993). If 
evaluation was conducted when the effect of a lesion would be reflected adequately in the spermatid 
count, then spermatid count may serve as a substitute for quantitative histologic analysis of sperm 
production (Russell et al, 1990). However, spermatid counts may be misleading when duration of 
exposure is shorter than the time required for a lesion to be fully expressed in the spermatid count. 
Also, spermatid counts reported from some laboratories have large coefficients of variation that may 
reduce the statistical power and thus the usefulness of that measure. 

The ability to detect a decrease in testicular sperm production may be enhanced if spermatid 
counts are available. However, spermatid enumerations only reflect the integrity of spermatogenic 
processes within the testes. Posttesticular effects or toxicity expressed as alterations in motility, 
morphology, viability, fragility, and other properties of sperm can be determined only from epididymal, 
vas deferens, or ejaculated samples. 

3.2.3.4.2. Sperm morphology. Sperm morphology refers to structural aspects of sperm and can be 
evaluated in cauda epididymal, vas deferens, or ejaculated samples. A thorough morphologic 
evaluation identifies abnormalities in the sperm head and flagellum. Because of the suggested 
correlation between an agent's mutagenicity and its ability to induce abnormal sperm, sperm head 
morphology has been a frequently reported sperm variable in toxicologic studies on test species 
(Wyrobek et al, 1983b). The tendency has been to conclude that increased incidence of sperm head 
malformations reflects germ-cell mutagenicity. However, not every mutagen induces sperm head 
abnormalities, and other nonmutagenic chemicals may alter sperm head morphology. For example, 
microtubule poisons may cause increases in abnormal sperm head incidence, presumably by interfering 
with spermiogenesis, a microtubule-dependent process (Russell et al, 1981). Sperm morphology may 
be altered also due to degeneration subsequent to cell death. Thus, the link between sperm 
morphology and mutagenicity is not necessarily sensitive or specific. 

An increase in abnormal sperm morphology has been considered evidence that the agent has 
gained access to the germ cells (U.S. EPA, 1986c). Exposure of males to toxic agents may lead to 
sperm abnormalities in their progeny (Wyrobek and Bruce, 1978; Hugenholtz and Bruce, 1983; 
Morrissey et al, 1988a,b). However, transmissible germ-cell mutations might exist in the absence of 
any warning morphologic indicator such as abnormal sperm. The relationships between these 
morphologic alterations and other karyotypic changes remain uncertain (de Boer et al., 1976). 

The traditional approach to characterizing morphology in toxicologic testing has relied on 
subjective categorization of sperm head, midpiece, and tail defects in either stained preparations by 
bright field microscopy (Filler, 1993) or fixed, unstained preparations by phase contrast microscopy 
(Linder et al, 1992; Seed et al, 1996). Such an approach may be adequate for mice and rats with 
their distinctly angular head shapes. However, the observable heterogeneity of structure in human 
sperm and in nonrodent species makes it difficult for the morphologist to define clearly the limits of 
normality. More systematic, quantitative, and automated approaches have been offered that can be 
used with humans and test species (Katz et al., 1982; Wyrobek et al., 1984). Data that categorize the 
types of abnormalities observed and quantify the frequencies of their occurrences are preferred to 
estimation of overall proportion of abnormal sperm. Objective, quantitative approaches that are done 
properly should result in a higher level of confidence than more subjective measures. 

Sperm morphology profiles are relatively stable and characteristic in a normal individual (and a 
strain within a species) over time. Sperm morphology is one of the least variable sperm measures in 
normal individuals, which may enhance its use in the detection of spermatotoxic events (Zenick et al, 
1994). However, the reproductive implications of the various types of abnormal sperm morphology 
need to be delineated more fully. The majority of studies in test species and humans have suggested 
that abnormally shaped sperm may not reach the oviduct or participate in fertilization (Nestor and 
Handel, 1984; Redi et al, 1984). The implication is that the greater the number of abnormal sperm in 
the ejaculate, the greater the probability of reduced fertility. 

3.2.3.4.3. Sperm motility. The biochemical environments in the testes and epididymides are highly 
regulated to assure the proper development and maturation of the sperm and the acquisition of critical 
functional characteristics, i.e., progressive motility and the potential to fertilize. With chemical 
exposures, perturbation of this balance may occur, producing alterations in sperm properties such as 
motility. Chemicals (e.g., epichlorohydrin) have been identified that selectively affect sperm motility and 
also reduce fertility. Studies have examined rat sperm motility as a reproductive endpoint (Morrissey et 
al, 1988a,b; Toth et al, 1989b, 1991b), and sperm motility assessments are an integral part of some 
reproductive toxicity tests (Gray et al, 1988; Morrissey et al., 1989; U.S. EPA, 1996a). 

Motility estimates may be obtained on ejaculated, vas deferens, or cauda epididymal samples. 
Standardized methods are needed because motility is influenced by a number of experimental variables, 
including abstinence interval, method of sample collection and handling, elapsed time between sampling 
and observation, the temperature at which the sample is stored and analyzed, the extent of sperm 
dilution, the nature of the dilution medium, and the microscopic chamber employed for the observations 
(Slott et al, 1991; Toth et al., 1991a; Chapin et al., 1992; Schrader et al, 1992; Weir and Rumberger, 
1995; Seed et al., 1996). 

Sperm motility can be evaluated in fresh samples under phase contrast microscopy, or sperm 
images can be recorded and stored in video or digital format and analyzed later, either manually or by 
computer-aided semen analysis (Linder et al, 1986; Boyers et al, 1989; Toth et al, 1989a; Yeung et 
al, 1992; Slott and Perreault, 1993). For manual assessments, the percentage of motile and 
progressively motile sperm can be estimated and a simple scale used to describe the vigor of the sperm 
motion. 

The recent application of video and/or digital technology to sperm analysis allows a more 
detailed evaluation of sperm motion including information about the individual sperm tracks. It also 
provides permanent storage of the sperm tracks which can be reanalyzed as necessary (manually or 
computer-assisted). With computer-assisted technology, information about sperm velocity (straight-line 
and curvilinear) as well as the amplitude and frequency of the track are obtained rapidly and efficiently 
on large numbers of sperm. Using this technology, chemically induced alterations in sperm motion have 
been detected (Toth et al, 1989a, 1992; Slott et al, 1990; Klinefelter et al, 1994a), and such changes 
have been related to the fertility of the exposed animals (Toth et al, 1991a; Oberlander et al, 1994; 
Slott et al, 1995). These preliminary studies indicate that significant reductions in sperm velocity are 
associated with infertility, even when the percentage of motile sperm is not affected. The ability to 
distinguish between the proportion of sperm showing any type of motion and those with progressive 
motility is important (Seed et al, 1996). 
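
The two velocity measures mentioned above can be sketched from a digitized track as follows. This is a minimal illustration under common definitions (curvilinear velocity as total path length over elapsed time, straight-line velocity as first-to-last distance over elapsed time); it is not the algorithm of any particular computer-assisted system. The linearity ratio is an additional quantity not discussed in the text, and the track coordinates are hypothetical. 

```python
import math


def track_velocities(points, frame_interval_s):
    """Return (VCL, VSL, LIN) for one sperm track.

    points: list of (x, y) head positions in micrometers, one per frame.
    VCL (curvilinear) = total path length / elapsed time.
    VSL (straight-line) = first-to-last distance / elapsed time.
    LIN = VSL / VCL (0 when the sperm did not move).
    """
    if len(points) < 2:
        raise ValueError("a track needs at least two points")
    elapsed = frame_interval_s * (len(points) - 1)
    path = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    straight = math.dist(points[0], points[-1])
    vcl, vsl = path / elapsed, straight / elapsed
    lin = vsl / vcl if vcl > 0 else 0.0
    return vcl, vsl, lin


# Hypothetical zigzag track sampled every 0.5 s (positions in micrometers).
vcl, vsl, lin = track_velocities([(0, 0), (3, 4), (6, 0)], 0.5)
print(f"VCL={vcl:.1f} um/s  VSL={vsl:.1f} um/s  LIN={lin:.2f}")
```

A sperm that moves vigorously but makes little forward progress has a high curvilinear velocity and a low straight-line velocity, which is one way to separate any motion from progressive motility. 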

Changes in endpoints that measure effects on spermatogenesis and sperm maturation have been 
related to fertility in several test species, but the ability to predict infertility from these data (in the 
absence of fertility data) is not reliable. This is in part due to the observation, in both test species and 
humans, that fertility is dependent not only on having adequate numbers of sperm, but also on the 
degree to which those sperm are normal. If sperm quality is high, then sperm number must be 
substantially reduced before fertility is affected. For example, in a rat model that employs artificial 
insemination of differing numbers of good quality sperm, sperm numbers can be reduced substantially 
before fertility is affected (Klinefelter et al., 1994b). In humans, the distribution of sperm counts for 
fertile and infertile men overlap, with the mean for fertile men being higher (Meistrich and Brown, 
1983), but fertility is likely to be impaired when counts drop below 20 million/mL (WHO, 1992). 
Similarly, if sperm numbers are normal in rodents, a relatively large effect on sperm motility is required 
before fertility is affected. For example, rodent sperm velocity must be substantially reduced, in the 
presence of adequate numbers of sperm, before fertility is affected (Toth et al., 1991a; Slott et al., 
1995). These models also show that relatively modest changes in sperm numbers or quality may not 
cause infertility, but can nevertheless be predictive of infertility. On the other hand, fertility may be 
impaired by smaller decrements in both number and motility (or other qualitative characteristics). 

Thus, the process of reproductive risk assessment is facilitated by having information on a 
variety of sperm measures and reproductive organ histopathology in addition to fertility. Specific 
information about reproductive organ and gamete function can then be used to evaluate the occurrence 
and extent of injury, and the probable site of toxicity in the reproductive system. The more information 
that is available from supplementary endpoints, the more the risk assessment can be based on science 
rather than uncertainty. 

3.2.3.4.4. Adverse effects. Human male fertility is generally lower than that of test species and may 
be more susceptible to damage from toxic agents (see Supplementary Information). Therefore, the 
conservative approach should be taken that, within the limits indicated in the sections on those 
parameters, statistically significant changes in measures of sperm count, morphology, or motility as well 
as number of normal sperm should be considered adverse effects. 

3.2.3.5. Paternally Mediated Effects on Offspring 

The concept is well accepted that exposure of a female to toxic chemicals during gestation or 
lactation may produce death, structural abnormalities, growth alteration, or postnatal functional deficits 
in her offspring. Sufficient data now exist with a variety of agents to conclude that male-only exposure 
also can produce deleterious effects in offspring (Davis et al., 1992; Colie, 1993; Savitz et al, 1994; 
Qiu et al., 1995). Paternally mediated effects include pre- and postimplantation loss, growth and 
behavioral deficits, and malformations. A large proportion of the chemicals reported to cause 
paternally mediated effects have genotoxic activity, and are considered to exert this effect via 
transmissible genetic alterations. Low doses of cyclophosphamide have resulted in induction of single 
strand DNA breaks during rat spermatogenesis which, due in part to absence of subsequent DNA 
repair capability, remain at fertilization (Qiu et al, 1995). The results of such damage have been 
observed in the F2 generation offspring (Hales et al., 1992). Other mechanisms of induction of 
paternally mediated effects are also possible. Xenobiotics present in seminal plasma or bound to the 
fertilizing sperm could be introduced into the female genital tract, or even the oocyte directly, and might 
also interfere with fertilization or early development. With humans, the possibility exists that a parent 
could transport the toxic agent from the work environment to the home (e.g., on work clothes), 
exposing other adults or children. Further work is needed to clarify the extent to which paternal 
exposures may be associated with adverse effects on offspring. Regardless, if an agent is identified in 
test species or in humans as causing a paternally mediated adverse effect on offspring, the effect should 
be considered an adverse reproductive effect. 

3.2.4. Female-Specific Endpoints 

3.2.4.1. Introduction 

The reproductive life cycle of the female may be divided into phases that include fetal, 
prepubertal, cycling adult, pregnant, lactating, and reproductively senescent. Detailed descriptions of all 
phases are available (Knobil et al, 1994). It is important to detect adverse effects occurring in any of 
these stages. Traditionally, the endpoints that have been used have emphasized ability to become 
pregnant, pregnancy outcome, and offspring survival and development. Although reproductive organ 
weights may be obtained and these organs examined histologically in test species, these measures do 
not necessarily detect abnormalities in dynamic processes such as estrous cyclicity or follicular atresia 
unless degradation is severe. Similarly, toxic effects on onset of puberty have not been examined, nor 
have the long-term consequences of exposure on reproductive senescence. Thus, the amount of 
information obtained routinely to detect toxic effects on the female reproductive system has been 
limited. 

The consequences of impairment in the nonpregnant female reproductive system are equally 
important, and endpoints to detect adverse effects on the nonpregnant reproductive system, when 
available, can be useful in evaluating reproductive toxicity. Such measures may also provide additional 
interrelated endpoints and information on mechanism of action. 



Adverse alterations in the nonpregnant female reproductive system have been observed at dose 
levels below those that result in reduced fertility or produce other overt effects on pregnancy or 
pregnancy outcomes (Le Vier and Jankowiak, 1972; Barsotti et al, 1979; Sonawane and Yaffe, 1983; 
Cummings and Gray, 1987). In contrast to the male reproductive system, the status of the normal 
female system fluctuates in adults. Thus, in nonpregnant animals (including humans), the ovarian 
structures and other reproductive organs change throughout the estrous or menstrual cycle. Although 
not cyclic, normal changes also accompany the progression of pregnancy, lactation, and return to 
cyclicity during or after lactation. These normal fluctuations may affect the endpoints used for 
evaluation. Therefore, knowledge of the reproductive status of the female at necropsy, including the 
stage of the estrous cycle, can facilitate detection and interpretation of effects with endpoints such as 
uterine weight and histopathology of the ovary and uterus. Necropsy of all test animals at the same 
stage of the estrous cycle can reduce the variance of test results with such measures. 

A variety of measures to evaluate the integrity of the female reproductive system has been used 
in toxicity studies. With appropriate measures, a comprehensive evaluation of the reproductive process 
can be achieved, including identification of target organs and possible elucidation of the mechanisms 
involved in the agent's effects. Areas that may be examined in evaluations of the female reproductive 
system are listed in Table 5. 

Reproductive function in the female is controlled through complex interactions involving the 
central nervous system (particularly the hypothalamus), pituitary, ovaries, the reproductive tract, and the 
secondary sexual organs. Other nongonadotrophic components of the endocrine system may also 
modulate reproductive system function. Because it is difficult to measure certain important aspects of 
female reproductive function (e.g., increased rate of follicular atresia, ovulation failure), assessment of 
the endocrine status may provide needed insight that is not otherwise available. 

To understand the significance of effects on the reproductive endpoints, it is critical that the 
relationships between the various reproductive hormones and the female reproductive organs be 
understood. Although certain effects may be identified routinely as adverse, all of the results should be 
considered in the context of the known biology. 

The format used below for presentation of the female reproductive endpoints is altered from 
that used for the male to allow examination of events that are linked and that fluctuate with the changing 
endocrine status. Particularly, the organ weight, gross morphology, and histology are combined for 
each organ. Endpoints and endocrine factors for the individual female reproductive organs are 
discussed, with emphasis on the nonpregnant animal. This is followed by examination of measures of 
cyclicity and their interpretation. Then, considerations relevant to prepubertal, pregnant, lactating, and 
aging females are presented. 

3.2.4.2. Body Weight, Organ Weight, Organ Morphology, and Histology 
3.2.4.2.1. Body weight. Toxicologists are often concerned about how a change in body weight may 
affect reproductive function. In females, an important consideration is that body weight fluctuates 
normally with the physiologic state of the animal because estrogen and progesterone are known to 
influence food intake and energy expenditure to an important extent (Wang, 1923; Wade, 1972). 
Water retention and fat deposition rates are also affected (Galletti and Klopper, 1964; Hervey and 
Hervey, 1967). Food consumption is elevated during pregnancy, in part 
because of the elevated serum progesterone level. One of the most sensitive noninvasive indicators of a 
compound with estrogenic action in the female rat is a reduction in food intake and body weight. Also, 
growth retardation induced by effects on extragonadal hormones (e.g., thyroid or growth hormone) can 
cause a delay in pubertal development, and induce acyclicity and infertility. Because of these 
endocrine-related fluctuations, the weights of the reproductive organs are poorly correlated with body 
weight, except in extreme cases. Thus, actual organ weight data, rather than organ to body weight 
ratios, should be reported and evaluated for the female reproductive system. 

Table 5. Female-specific endpoints of reproductive toxicity 

Organ weights                           Ovary, uterus, vagina, pituitary 
Visual examination and histopathology   Ovary, uterus, vagina, pituitary, oviduct, mammary gland 
Estrous (menstrual*) cycle normality    Vaginal smear cytology 
Sexual behavior                         Lordosis, time to mating, vaginal plugs, or sperm 
Hormone levels*                         LH, FSH, estrogen, progesterone, prolactin 
Lactation*                              Offspring growth, milk quantity and quality 
Development                             Normality of external genitalia*, vaginal opening, vaginal smear 
                                        cytology, onset of estrous behavior (menstruation*) 
Senescence                              Vaginal smear cytology, ovarian histology (menopause*) 

*Endpoints that can be obtained relatively noninvasively with humans. 

Chapin et al. (1993a,b) have studied the influence of food restriction on female Sprague- 
Dawley rats and Swiss CD-1 mice when body weights were 90%, 80%, or 70% of controls. Female 
rats were resistant to effects on reproductive function at 80% of control weight whereas mice showed 
adverse effects at 80% and a marginal effect at 90%. These results indicate that differences exist 
between species (and probably between strains) in the response of the female rodent reproductive 
system to reduced food intake or body weight reduction. 

3.2.4.2.2. Ovary. The ovary serves a number of functions that are critical to reproductive activity, 
including production and ovulation of oocytes. Estrogen is produced by developing follicles and 
progesterone is produced by corpora lutea that are formed after ovulation. 

3.2.4.2.2.1. Ovarian weight. Significant increases or decreases in ovarian weight compared with 
controls should be considered an indication of female reproductive toxicity. Although ovarian function 
shifts throughout the estrous cycle, ovarian weight in the normal rat does not show significant 
fluctuations. Still, oocyte and follicle depletion, persistent polycystic ovaries, inhibition of corpus luteum 
formation, luteal cyst development, reproductive aging, and altered hypothalamic-pituitary function may 
all be associated with changes in ovarian weight. Therefore, it is important that ovarian gross 
morphology and histology also be examined to allow correlation of alterations in those parameters with 
changes in ovarian weight. However, not all adverse histologic alterations in the ovary are concurrent 
with changes in ovarian weight. Therefore, a lack of effect on organ weights does not preclude the 
need for histologic evaluation. 

3.2.4.2.2.2. Histopathology. Histologic evaluation of the three major compartments of the ovary 
(i.e., follicular, luteal, and interstitial) plus the epithelial capsule and ovarian stroma may indicate ovarian 
toxicity. A number of pathologic conditions can be detected by ovarian histology (Kurman and Norris, 
1978; Langley and Fox, 1987). Methods are available to quantify the number of follicles and their 
stages of maturation (Plowchalk et al, 1993). These techniques may be useful when a compound 
depletes the pool of primordial follicles or alters their subsequent development and recruitment during 
the events leading to ovulation. 

3.2.4.2.2.3. Adverse effects. Significant changes in any of the following ovarian endpoints should 
be considered adverse: 

Increase or decrease in ovarian weight 
Increased incidence of follicular atresia 
Decreased number of primary follicles 
Decreased number or lifespan of corpora lutea 
Evidence of abnormal folliculogenesis or luteinization, including cystic follicles, 
    luteinized follicles, and failure of ovulation 
Evidence of altered puberty or premature reproductive senescence 

3.2.4.2.3. Uterus 

3.2.4.2.3.1. Uterine weight. An alteration in the weight of the uterus may be considered an 
indication of female reproductive organ toxicity. Compounds that inhibit steroidogenesis and cyclicity 
can dramatically reduce the weight of the uterus so that it appears atrophic and small. However, uterine 
weight fluctuates three- to fourfold throughout the estrous cycle, peaking at proestrus when, in response 
to increased estrogen secretion, the uterus is fluid filled and distended. This increase in uterine weight 
has been used as a basis for comparing relative potency of estrogenic compounds in bioassays (Kupfer, 
1987). As a result of the wide fluctuations in weight, uterine weights taken from cycling animals have a 
high variance, and large compound-related effects are required to demonstrate a significant effect unless 
interpreted relative to that animal's estrous cycle stage. A number of environmental compounds (e.g., 
pesticides such as methoxychlor and chlordecone, mycotoxins, polychlorinated biphenyls, alkylphenols, 
and phytoestrogens) possess varying degrees of estrogenic activity and have the potential to stimulate 
the female reproductive tract (Barlow and Sullivan, 1982; Bulger and Kupfer, 1985; Hughes, 1988). 
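
The point about interpreting uterine weight relative to estrous cycle stage can be illustrated with a 
short sketch. It assumes each animal's record is a (stage, weight) pair; the function name, the record 
layout, and the stage labels are hypothetical and are not part of these Guidelines. 

```python
from collections import defaultdict
from statistics import variance


def uterine_weight_variance(records):
    """Compare pooled variance with within-stage variance of uterine weights.

    records: iterable of (stage, weight) pairs, one per animal, where stage is
    a label for the estrous cycle stage at necropsy (labels are arbitrary).
    """
    records = list(records)
    by_stage = defaultdict(list)
    for stage, weight in records:
        by_stage[stage].append(weight)

    # Variance when cycle stage is ignored (needs at least two animals).
    pooled = variance([weight for _, weight in records])

    # Sample variance within each stage; stages with a single animal are
    # skipped because a sample variance is undefined for one observation.
    within = {stage: variance(ws) for stage, ws in by_stage.items() if len(ws) > 1}
    return pooled, within
```

If the within-stage variances are much smaller than the pooled variance, comparisons made within a 
stage (or necropsy of all animals at one stage) may detect smaller compound-related effects. 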

When pregnant or postpartum animals are examined, the numbers of implantation sites or 
implantation scars should be counted. This information, along with corpus luteum counts, can be used 
to calculate pre- and postimplantation losses. 
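
The calculation mentioned above is often expressed as percentages per female. The following is a 
minimal sketch that assumes the conventional definitions (preimplantation loss as corpora lutea not 
represented by implantation sites, postimplantation loss as implantation sites not represented by live 
fetuses). The Guidelines do not give the formulas here, so the definitions, function name, and 
parameter names are assumptions made for illustration. 

```python
def implantation_losses(corpora_lutea, implantation_sites, live_fetuses):
    """Return (pre, post) implantation loss as percentages for one female.

    Assumed conventional definitions (not specified in the text above):
      pre  = 100 * (corpora lutea - implantation sites) / corpora lutea
      post = 100 * (implantation sites - live fetuses) / implantation sites
    """
    if corpora_lutea <= 0:
        raise ValueError("corpora_lutea must be positive")
    if not 0 <= live_fetuses <= implantation_sites:
        raise ValueError("live_fetuses must be between 0 and implantation_sites")

    # Counting error can give more implantation sites than corpora lutea;
    # the loss is clamped at zero in that case rather than reported negative.
    pre = max(0.0, 100.0 * (corpora_lutea - implantation_sites) / corpora_lutea)

    # Postimplantation loss is undefined when nothing implanted.
    if implantation_sites == 0:
        post = None
    else:
        post = 100.0 * (implantation_sites - live_fetuses) / implantation_sites
    return pre, post
```

For postpartum animals, implantation scars and pups delivered would take the place of implantation 
sites and live fetuses under the same assumptions. 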

3.2.4.2.3.2. Histopathology. The histologic appearance of the normal uterus fluctuates with stage of 
the estrous cycle and pregnancy. The uterine endometrium is sensitive to influences of estrogens and 
progestogens (Warren et al, 1967), and extended treatment with these compounds leads to 
hypertrophy and hyperplasia. Conversely, inhibition of ovarian activity and reduced steroid secretion 
result in endometrial hypoplasia and atrophy, as well as altered vaginal smear cytology. Effects 
induced during development may delay or prevent puberty, resulting in persistence of infantile genitalia. 

3.2.4.2.3.3. Adverse effects. Effects on the uterus that may be considered adverse include significant 
dose-related alteration of weight, as well as gross anatomic or histologic abnormalities. In particular, 
any of the following effects should be considered as adverse. 

Infantile or malformed uterus or cervix 

Decreased or increased uterine weight 

Endometrial hyperplasia, hypoplasia, or aplasia 

Decreased number of implantation sites 

3.2.4.2.4. Oviducts. Typically, the oviducts are not weighed or examined histologically in tests for 
reproductive toxicity. However, information from visual and histologic examinations is of value in 
detecting morphologic anomalies. Descriptions of pathologic effects within the oviducts of animals 
other than humans are not common. Hypoplasia of otherwise well-formed oviducts and loss of cilia 
result most commonly from a lack of estrogen stimulation, and for this reason, this condition may not be 
recognized until after puberty. Hyperplasia of the oviductal epithelium results from prolonged 
estrogenic stimulation. Anomalies induced during development have also been described, including 
agenesis, segmental aplasia, and hypoplasia. 

Anatomic anomalies in the oviduct occurring in excess of control incidence should be 
considered as adverse effects. Hypoplasia or hyperplasia of the oviductal epithelium may be 
considered as an adverse effect, particularly if that result is consistent with observations in the uterine 
histology. 

3.2.4.2.5. Vagina and external genitalia 

3.2.4.2.5.1. Vaginal weight. Vaginal weight changes should parallel those seen in the uterus during 
the estrous cycle, although the magnitude of the changes is smaller. 

3.2.4.2.5.2. Histopathology. In rodents, cytologic changes in the vaginal epithelium (vaginal smear) 
may be used to identify the different stages of the estrous cycle (see Section 3.2.4.4). The vaginal 
smear pattern may be useful to identify conditions that would delay or preclude fertility, or affect sexual 
behavior. Other histologic alterations that may be observed include aplasia, hypoplasia, and 
hyperplasia of the vaginal epithelial cell lining. 

3.2.4.2.5.3. Developmental effects. Developmental abnormalities, either genetic or related to 
prenatal exposure to compounds that disrupt the endocrine balance, include agenesis, hypoplasia, and 
dysgenesis. Hypoplasia of the vagina may be concomitant with hyperplasia of the external genitalia and 
can be induced by gonadal or adrenal steroid exposure. In rodents, malpositioning of the vaginal and 
urethral ducts is common in steroid-treated females. Such developmentally induced lesions are 
irreversible. 

The sex ratio observed at birth may be affected by exposure of genotypic females in utero to 
agents that disrupt reproductive tract development. In cases of incomplete sex reversal because of 
such exposures, female rodents may appear more male-like and have an increased ano-genital distance 
(Gray and Ostby, 1995). 

At puberty, the opening of the vaginal orifice normally provides a simple and useful 
developmental marker. However, estrogenic or antiestrogenic chemicals can act directly on the vaginal 
epithelium and alter the age at which vaginal patency occurs without truly affecting puberty. 

3.2.4.2.5.4. Adverse effects. Significant effects on the vagina that may be considered adverse include 
the following: 

Increases or decreases in weight 

Infantile or malformed vagina or vulva, including masculinized vulva or increased ano- 
genital distance 
Vaginal hypoplasia or aplasia 
Altered timing of vaginal opening 
Abnormal vaginal smear cytology pattern 

3.2.4.2.6. Pituitary 

3.2.4.2.6.1. Pituitary weight. Alterations in weight of the pituitary gland should be considered an 
adverse effect. The discussion on pituitary weight and histology for males (see Section 3.2.3.2) is 
pertinent also for females. Pituitary weight increases normally with age, as well as during pregnancy and 
lactation. Changes in pituitary weight can occur also as a consequence of chemical stimulation. 
Increased pituitary weight often precedes tumor formation, particularly in response to treatment with 
estrogenic compounds. Increased pituitary size associated with estrogen treatment may be 
accompanied by hyperprolactinemia and constant vaginal estrus. Decreased pituitary weight is less 
common but may result from decreased estrogenic stimulation (Cooper et al, 1989). 



3.2.4.2.6.2. Histopathology. In histologic evaluations with rats and mice, the relative size of cell 
types in the anterior pituitary (acidophils and basophils) has been reported to vary with the stages of the 
reproductive cycle and in pregnancy (Holmes and Ball, 1974). Therefore, the relationship of 
morphologic pattern to estrous or menstrual cycle stage or pregnancy status should be considered in 
interpreting histologic observations on the female pituitary. 

3.2.4.2.6.3. Adverse effects. A significant increase or decrease in pituitary weight should be 
considered an adverse effect. Significant histopathologic damage in the pituitary should be considered 
an adverse effect, but should be shown to involve cells that control gonadotropin or prolactin 
production to be called a reproductive effect. 

3.2.4.3. Oocyte Production 

3.2.4.3.1. Folliculogenesis. In normal females, all of the follicles (and the resident oocytes) are 
present at or soon after birth. The large majority of these follicles undergo atresia and are not ovulated. 
If the population of follicles is depleted, it cannot be replaced and the female will be rendered infertile. 
In humans, depletion of oocytes leads to premature menopause. Ovarian follicle biology and toxicology 
have been reviewed by Crisp (1992). 

In rodents, lead, mercury, cadmium, and polyaromatic hydrocarbons have all been implicated in 
the arrest of follicular growth at various stages of the life cycle (Mattison and Thomford, 1989). 
Susceptibility to oocyte toxicity varies considerably between species (Mattison and Thorgeirsson, 
1978). 

Environmental agents that affect gonadotropin-mediated ovarian steroidogenesis or follicular 
maturation can prolong the follicular phase of the estrous or menstrual cycle and cause atresia of 
follicles that would otherwise ovulate. Estrogenic as well as antiestrogenic agents can produce this 
effect. Also, normal follicular maturation is essential for normal formation and function of the corpus 
luteum formed after ovulation (McNatty, 1979). 

3.2.4.3.2. Ovulation. Chemicals can delay or block ovulation by disrupting the ovulatory surge of 
luteinizing hormone (LH) or by interfering with the ability of the maturing follicle to respond to that 
gonadotropic signal. Examples for rats include compounds that interfere with normal central nervous 
system (CNS) norepinephrine receptor stimulation such as the pesticides chlordimeform and amitraz 
(Goldman et al, 1990, 1991) and compounds that interfere with norepinephrine synthesis such as the 
fungicide thiram (Stoker et al, 1993). Compounds that increase central opioid receptor stimulation 
also decrease serum LH and inhibit ovulation in monkeys and rats (Pang et al, 1977; Smith, C.G., 
1983). Delayed ovulation can alter oocyte viability and cause trisomy and polyploidy in the conceptus 
(Fugo and Butcher, 1966; Butcher and Fugo, 1967; Butcher et al, 1969, 1975; Na et al, 1985). 
Delayed ovulation induced by exposure to the pesticide chlordimeform has also been shown to alter 
fetal development and pregnancy outcome in rats (Cooper et al, 1994). 

3.2.4.3.3. Corpus luteum. The corpus luteum arises from the ruptured follicle and secretes 
progesterone, which has an important role in the estrous or menstrual cycle. Luteal progesterone is also 
required for the maintenance of early pregnancy in most mammalian species, including humans (Csapo 
and Pulkkinen, 1978). Therefore, establishment and maintenance of normal corpora lutea are essential 
to normal reproductive function. However, with the exception of histopathologic evaluations that may 
establish only their presence or absence, these structures are not evaluated in routine testing. Additional 
research is needed to determine the importance of incorporating endpoints that examine direct effects 
on luteal function in routine toxicologic testing. 

3.2.4.3.3.1. Adverse effects. Increased rates of follicular atresia and oocyte toxicity lead to 
premature menopause in humans. Altered follicular development, ovulation failure, or altered corpus 
luteum formation and function can result in disruption of cyclicity and reduced fertility, and, in 
nonprimates, interference with normal sexual behavior. Therefore, significant increases in the rate of 
follicular atresia, evidence of oocyte toxicity, interference with ovulation, or altered corpus luteum 
formation or function should be considered adverse effects. 

3.2.4.4. Alterations in the Female Reproductive Cycle 

The pattern of events in the estrous cycle may provide a useful indicator of the normality of 
reproductive neuroendocrine and ovarian function in the nonpregnant female. It also provides a means 
to interpret hormonal, histologic, and morphologic measurements relative to stage of the cycle, and can 
be useful to monitor the status of mated females. Estrous cycle normality can be monitored in the rat 
and mouse by observing the changes in the vaginal smear cytology (Long and Evans, 1922; Cooper et 
al, 1993). To be most useful with cycling females, vaginal smear cytology should be examined daily for 
at least three normal estrous cycles prior to treatment, after onset of treatment, and before necropsy 
(Kimmel, G.A. et al, 1995). However, practical limitations in testing may limit the examination to the 
period before mating or necropsy. 

Daily vaginal smear data from rodents can provide useful information on (1) cycle length, (2) 
occurrence or persistence of estrus, (3) duration or persistence of diestrus, (4) incidence of 
spontaneous pseudopregnancy, (5) distinguishing pregnancy from pseudopregnancy (based on the 
number of days the smear remains leukocytic), and (6) indications of fetal death and resorption by the 
presence of blood in the smear after day 12 of gestation. The technique also can detect onset of 
reproductive senescence in rodents (LeFevre and McClintock, 1988). It is useful further to detect the 
presence of sperm in the vagina as an indication of mating. 

In nonpregnant females, repetitive occurrence of the four stages of the estrous cycle at regular, 
normal intervals suggests that neuroendocrine control of the cycle and ovarian responses to that control 
are normal. Even normal, control animals can show irregular cycles. However, a significant alteration 
compared with controls in the interval between occurrence of estrus for a treatment group is cause for 
concern. Generally, the cycle will be lengthened or the animals will become acyclic. Lengthening of the 
cycle may be a result of increased duration of either estrus or diestrus. Knowing the affected phase can 
provide direction for further investigation. 
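
The measures described above (cycle length as the interval between occurrences of estrus, and 
persistence of estrus or diestrus) can be derived from a daily smear record. The sketch below assumes 
one stage code per day, with "P", "E", "M", and "D" standing for proestrus, estrus, metestrus, and 
diestrus. The codes, the function name, and the choice to measure cycle length from the first day of 
one estrus to the first day of the next are assumptions made for illustration, not a procedure taken 
from these Guidelines. 

```python
from itertools import groupby


def summarize_smears(stages):
    """Summarize a daily vaginal smear record for one female.

    stages: string or list of daily stage codes ("P", "E", "M", "D"), in
    chronological order with no missing days.
    """
    # First day of each run of estrus. A run already in progress on day 0 is
    # ignored because its true onset was not observed.
    onsets = [
        day for day, stage in enumerate(stages)
        if stage == "E" and day > 0 and stages[day - 1] != "E"
    ]
    # Cycle length: days from one observed onset of estrus to the next.
    cycle_lengths = [later - earlier for earlier, later in zip(onsets, onsets[1:])]

    # Longest uninterrupted run of estrus and of diestrus, in days.
    longest = {"E": 0, "D": 0}
    for stage, run in groupby(stages):
        if stage in longest:
            longest[stage] = max(longest[stage], len(list(run)))

    return {
        "cycle_lengths": cycle_lengths,
        "longest_estrus_run": longest["E"],
        "longest_diestrus_run": longest["D"],
    }
```

Comparing the run lengths between treated and control groups indicates whether a lengthened cycle 
reflects extended estrus or extended diestrus. No threshold for "persistent" is built in, because the 
text does not define one. 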

The persistence of regular vaginal cycles after treatment does not necessarily indicate that 
ovulation occurred, because luteal tissue may form in follicles that have not ruptured. This effect has 
been observed after treatment with anti-inflammatory agents (Walker et al., 1988). However, that 
effect should be reflected in reduced fertility. Conversely, subtle alterations of cyclicity can occur at 
doses below those that alter fertility (Gray et al., 1989). 

Irregular cycles may reflect impaired ovulation. Extended vaginal estrus usually indicates that 
the female cannot spontaneously achieve the ovulatory surge of LH (Huang and Meites, 1975). A 
number of compounds have been shown to alter the characteristics of the LH surge including 
anesthetics (Nembutal), neurotransmitter receptor binding agents (Drouva et al, 1982), and the 
pesticides chlordimeform and lindane (Cooper et al., 1989; Morris et al, 1990). Persistent or constant 
vaginal cornification (or vaginal estrus) may result from one or several effects. Typically, in the adult, if 
the vaginal epithelium becomes cornified and remains so in response to toxicant exposure, it is the result 
of the agent's estrogenic properties (e.g., DES or methoxychlor), or the ability of the agent to block 
ovulation. In the latter case, the follicle persists and endogenous estrogen levels bring about the 
persistent vaginal cornification. Histologically, the ovaries in persistent estrus will be atrophied following 
exposure to estrogenic substances. In contrast, the ovaries of females in which ovulation has been 
blocked because of altered gonadotropin secretion will contain several large follicles and no corpora 
lutea. Females in constant estrus may be sexually receptive regardless of the mechanism responsible for 
this altered ovarian condition. However, if ovulation has been blocked by the treatment, an LH surge 
may be induced by mating (Brown-Grant et al, 1973; Smith, E.R. and Davidson, 1974) and a 
pregnancy or pseudopregnancy may ensue. The fertility of such matings is reduced (Cooper et al, 
1994). Significant delays in ovulation can result in increased embryonic abnormalities and pregnancy 
loss (Fugo and Butcher, 1966; Cooper et al, 1994). 

Persistent diestrus indicates temporary or permanent cessation of follicular development and 
ovulation, and thus at least temporary infertility. Prolonged vaginal diestrus, or anestrus, may be 
indicative of agents (e.g., polyaromatic hydrocarbons) that interfere with follicular development or 
deplete the pool of primordial follicles (Mattison and Nightingale, 1980) or agents such as atrazine that 
interrupt gonadotropin support of the ovary (Cooper et al., 1996). Pseudopregnancy is another altered 
endocrine state reflected by persistent diestrus. A pseudopregnant condition also has been shown to 
result in rats following single or multiple doses of atrazine (Cooper et al., 1996). The ovaries of 
anestrous females are atrophic, with few primary follicles and an unstimulated uterus (Huang and 
Meites, 1975). Serum estradiol and progesterone are abnormally low. 

3.2.4.4.1. Adverse effects. Significant evidence that the estrous cycle (or menstrual cycle in primates) 
has been disrupted should be considered an adverse effect. Included should be evidence of abnormal 
cycle length or pattern, ovulation failure, or abnormal menstruation. 

3.2.4.5. Mammary Gland and Lactation 

The mammary glands of normal adults change dramatically during the period around parturition 
because of the sequential effects of a number of gonadal and extragonadal hormones. Milk letdown is 
dependent on the suckling stimulus and the release of oxytocin from the posterior pituitary. Thus, 
mammary tissue is highly endocrine dependent for development and function (Wolff, 1993; Imagawa et 
al, 1994; Tucker, 1994). 

Mammary gland size, milk production and release, and histology can be affected adversely by 
toxic agents, and many exogenous chemicals and drugs are transferred into milk (American Academy 
of Pediatrics Committee on Drugs, 1994; Oskarsson et al., 1995; Sonawane, 1995). Reduced growth 
of young could be caused by reduced milk availability, palatability or quality, by ingestion of a toxic 
agent secreted into the milk, or by other factors unrelated to lactational ability (e.g., deficient suckling 
ability or deficient maternal behavior). Perinatal exposure to steroid hormones and other chemicals can 
alter mammary gland morphology and tumor potential in adulthood. Because of the tendency for 
mobilization of lipids from adipose tissue and secretion of those lipids into milk by lactating females, 
milk may contain lipophilic agents at concentrations equal to or higher than those present in the blood or 
organs of the dam. Thus, suckling offspring may be exposed to elevated levels of such agents. 

Techniques for measuring mammary tissue development, nucleic acid content, milk production 
and milk composition in rodents are discussed by Tucker (1994). During lactation, the mammary 
glands can be dissected and weighed only with difficulty. RNA content of the mammary glands may be 
measured as an index of lactational potential. More direct estimates of milk production may be 
obtained by measuring litter weights of milk-deprived pups taken before and after nursing. Milk from 
the stomachs of pups treated similarly can also be weighed at necropsy. Cleared and stained whole 
mounts of the mammary gland can be prepared at necropsy for histologic examination. The DNA, 
RNA, and lipid content of the mammary gland and the composition of the milk have been measured 
following toxicant administration as indicators of toxicity to this target organ. 

Significant reductions in milk production or negative effects on milk quality, whether measured 
directly or reflected in impaired development of young, should be considered adverse reproductive 
effects. 

3.2.4.6. Reproductive Senescence 

With advancing age, there is a loss of the regular ovarian cycles and associated normal cyclical 
changes in the uterine and vaginal epithelium that are typical of the young-adult female rat (Cooper and 
Walker, 1979). Although the mechanisms responsible for this loss of cycling are not thoroughly 
understood, age-dependent changes occur within the hypothalamic-pituitary control of ovulation 
(Cooper et al., 1980; Finch et al., 1984). Cumulative exposure to estrogen secreted by the ovary may 
play a role, as treatment with estrogens during adulthood can accelerate the age-related loss of ovarian 
function (Brawer and Finch, 1983). In contrast, the principal cause of the loss of ovarian cycling in 
humans appears to be the depletion of oocytes (Mattison, 1985). 

Prenatal or postnatal treatment of females with estrogens or estrogenic pesticides can also 
cause impaired ovulation and sterility (Gorski, 1979). These observations imply that alterations in 
ovarian function may not be noticeable immediately after treatment but may become evident at puberty 
or influence the age at which reproductive senescence occurs. 

3.2.4.6.1. Adverse effects. Significant effects on measures showing a decrease in the age of onset of 
reproductive senescence in females should be considered adverse. Cessation of normal cycling, which 
is measured by vaginal smear cytology, ovarian histopathology, or an endocrine profile that is consistent 
with this interpretation, should be included as an adverse effect. 

3.2.5. Developmental and Pubertal Alterations 

3.2.5.1. Developmental Effects 

Alterations of reproductive differentiation and development, including those produced by 
endocrine system disruption, can result in infertility, functional and morphologic alterations of the 
reproductive system, and cancer (Steinberger and Lloyd, 1985; Gray, 1991). Prenatal and postnatal 
exposure to toxicants can produce changes that may not be predicted from effects seen in adults, and 
those effects are often irreversible. Adverse developmental outcomes in either sex can result from 
exposure to toxic agents in utero, through contact with exposed dams, or in milk. Dosing of dams 
during lactation also can result in developmental effects through impaired nursing capability of the dams. 

Effects observed in rodents following developmental exposure to agents can include alterations 
in the genitalia (including ano-genital distance), inhibited (female) or retained (male) nipple development, 
impaired sexual behavior, delay or acceleration of the onset of puberty, and reduced fertility (Gray et 
al, 1985, 1994, 1995; Gray and Ostby, 1995; Kelce et al., 1995). Effects may include altered sexual 
behavior or ability to produce gametes normally that are not observed until after puberty. Hepatic 
enzyme systems for steroid metabolism that are imprinted during development may be altered in males. 
Testis descent from the abdominal cavity into the scrotum may be delayed or may not occur. 
Generally, the type of effect seen may differ depending on the stage of development at which the 
exposure occurred. 

Many of these effects have been detected in human females and males exposed prenatally to 
diethylstilbestrol (DES), other estrogens, progestins, androgens, and anti-androgens (Giusti et al., 1995; 
Harrison et al, 1995). Accelerated reproductive aging and tumors of the reproductive tract have been 
observed in laboratory animal and human females after pre- or perinatal exposure to hormonally active 
agents. However, capability to alter sexual differentiation is not limited to agents with known direct 
hormonal activity. Other agents, for which the mode of action is not known (e.g., busulfan, nitrofen), or 
which affect the endocrine system indirectly (e.g., PCBs, dioxin), may act via different mechanisms 
during critical periods of development to alter sexual differentiation and reproductive system 
development. 

3.2.5.2. Effects on Puberty 

In female rats and mice, the age at vaginal opening is the most commonly measured marker of 
puberty. This event results from an increase in the blood level of estradiol. The ages and weights of 
females at the first cornified (estrous) vaginal smear, the first diestrous smear, and the onset of vaginal 
cycles have also been used as endpoints for onset of puberty. In males, preputial separation or 
appearance of sperm in expressed urine or ejaculates can serve as markers of puberty. Body weight at 
puberty may provide a means to separate specific delays in puberty from those that are related to 
general delays in development. Agents may differentially affect the endpoints related to puberty onset, 
so it is useful to have information on more than one marker. 

Puberty can be accelerated or delayed by exogenous agents, and both types of effects may be 
adverse (Gray et al., 1989, 1995; Gray and Ostby, 1995; Kelce et al., 1995). For example, an 
acceleration of vaginal opening may be associated with a delay in the onset of cyclicity, infertility, and 
with accelerated reproductive aging (Gorski, 1979). Delays in pubertal development in rodents are 
usually related to delayed maturation or inhibition of function of the hypothalamic-pituitary axis. 
Adverse reproductive outcomes have been reported in rodents when puberty is altered by a week or 
more, but the biologic relevance of a change in these measures of a day or two is unknown (Gray, 
1991). 

3.2.5.3. Adverse Effects 

Effects induced or observed during the pre- or perinatal period should be judged using 
guidance from the Guidelines for Developmental Toxicity Risk Assessment (U.S. EPA, 1991) as 
well as from these Guidelines. Significant effects on ano-genital distance or age at puberty, either early 
or delayed, should be considered adverse as should malformations of the internal or external genitalia. 
Included as adverse effects for females should be effects on nipple development, age at vaginal 
opening, onset of cyclic vaginal smears, onset of estrus or menstruation, or onset of an endocrine or 
behavioral pattern consistent with estrous or menstrual cyclicity. Included as adverse effects for males 
should be delay or failure of testis descent, as well as delays in age at preputial separation or 
appearance of sperm in expressed urine or ejaculates. 

3.2.6. Endocrine Evaluations 

Toxic agents can alter endocrine system function by affecting any part of the hypothalamic- 
pituitary-gonadal-reproductive tract axis. Effects may be induced in either sex by altering hormone 
synthesis, storage, release, transport, or clearance, as well as by altering hormone receptor recognition 
or postreceptor responses. The involvement of the endocrine system in female reproductive physiology 
and toxicology has been presented to a substantial degree as a necessary component in Section 3.2.4 
(Female-Specific Endpoints). The information in that section should be considered together with the 
following material. 

The male reproductive system can be affected adversely by disruption of the normal endocrine 
balance. In adults, effects that result in interference with normal concentrations or action of LH and/or 
follicle stimulating hormone (FSH) can decrease or abolish spermatogenesis, affect secondary sex 
organ (e.g., epididymis) and accessory sex gland (e.g., prostate, seminal vesicle) function, and impair 
sexual behavior (Sharpe, 1994). In mammals, a female reproductive tract develops unless androgen is 
produced and utilized normally by the fetus (Byskov and Hoyer, 1994; George and Wilson, 1994). 
Therefore, the consequences of disruption of the normal endocrine pattern during development of the 
male reproductive system pre- and postnatally are of particular concern. Differentiation and 
development of the male reproductive system are especially sensitive to substances that interfere with 
the production or action of androgens (testosterone and dihydrotestosterone). Sexual differentiation of 
the CNS can be affected also. Therefore, interference with normal production or response to 
androgens can result in a range of abnormal effects in genotypic males ranging from a 
pseudohermaphrodite condition to reduction in sperm production or altered sexual behavior. 
Chemicals with estrogenic or anti-androgenic activity have been identified that are capable, with 
sufficient exposure levels, of causing effects of these types in males (Gray et al., 1994; Harrison et al., 
1995; Kelce et al., 1995). While sensitivity may differ, it is likely that mechanisms of action for these 
endocrine disrupting agents will be consistent across mammalian species. Chemicals with the ability to 
interact with the Ah receptor (e.g., dioxin or PCBs) may also disrupt reproductive system development 
or function (Brouwer et al., 1995; Safe, 1995). Several of the effects seen with exposure of male and 
female rats and hamsters to such chemicals differ from those caused by estrogens, indicating a different 
mechanism of action. 

The developing nervous system can be a target of chemicals. In rats, sexual differentiation of 
the CNS can be modified by hormonal treatments or exposure to environmental agents that mimic or 
interfere with the action of certain hormones. Prior to gender differentiation, the brain is inherently 
female or at least bipotential (Gorski, 1986). Thus, the functional and structural sex differences in the 
CNS are not due directly to sex differences in neuronal genomic expression, but rather are imprinted by 
the gonadal steroid environment during development. 

Chemicals with endocrine activity have been shown to masculinize the CNS of female rats. 
Examples include chlordecone (Gellert, 1978), DDT (Bulger and Kupfer, 1985), and methoxychlor 
(Gray et al., 1989). Exposure of newborn female rats to these agents during the critical period of 
sexual differentiation can alter the timing of puberty and perturb subsequent reproductive function, 
presumably by altering the development of the neural mechanisms that regulate gonadotropin secretion. 

In females, the situation is more complex than in males due to the female cycle, the fertilization 
process, gestation and lactation. All of the functions of the female reproductive system are under 
endocrine control, and therefore can be susceptible to disruption by effects on the reproductive 
endocrine system. 

As with males, disturbance of the normal endocrine patterns during development can result in 
abnormal development of the female reproductive tract at exposure levels that tend to be lower than 
those affecting adult females (Gellert, 1978; Brouwer et al., 1995). Consistent with the differentiation 
mechanism described above, exposure of genotypic females to androgens causes formation of 
pseudohermaphrodite reproductive tracts with varying degrees of severity as well as alteration of brain 
imprinting. However, exposure to estrogenic substances during development also results in adverse 
effects on anatomy and function including, in rats, malformations of the genitalia. Exposure of human 
females to diethylstilbestrol in utero has been shown to cause an increased incidence of vaginal clear cell 
adenoma (Giusti et al., 1995). Dioxin, presumably acting through the Ah receptor, also disrupts 
development of the female reproductive system (Gray and Ostby, 1995). 

Endpoints can be included in standardized toxicity testing that are capable of detecting, but are 
not specific for, effects of reproductive endocrine system disruption. For effects of exposure on adults, 
endpoints can be incorporated into the subchronic toxicity protocol or into reproductive toxicity 
protocols. For effects that are induced during development, protocols that include exposure throughout 
the development process and allow evaluation of the offspring postpubertally are needed. Data from 
specialized testing, including in vitro screening tests, may be useful to evaluate further the site, timing, 
and mechanism of action. 

Endpoints that can detect endocrine-related effects with adult-only exposure in standardized 
testing include evaluation of fertility, reproductive organ appearance, weights, and histopathology, 
oocyte number, cycle normality and mating behavior. Endpoints that can detect effects induced by 
endocrine system disruption during development include, in addition to those identified for adult- 
exposed animals, the reproductive developmental endpoints identified in Section 3.2.5. Significant 
effects on any of these measures may be considered to be adverse if the results are consistent and 
biologically plausible. 

Levels of the reproductive hormones are not available routinely from toxicity testing. However, 
measurements of the reproductive hormones in males offer useful supplemental information in assessing 
potential reproductive toxicity for test species (Sever and Hessol, 1984; Heywood and James, 1985; 
NRC, 1989). Such measurements have increased importance with humans where invasiveness of 
approaches must be limited. The reproductive hormones measured often are circulating levels of LH, 
FSH, and testosterone. Other useful measures that may be available include prolactin, inhibin, and 
androgen binding protein levels. In addition, challenge tests with exogenous agents (e.g., gonadotropin 
releasing hormone, LH, or human chorionic gonadotropin) may provide insight into the functional 
responsiveness of the pituitary or Leydig cells. 

Interpretation of endocrine effects is facilitated if information is available on a battery of 
hormones. However, in evaluating such data, it is important to consider that serum hormones such as 
FSH, LH, prolactin, and androgens exhibit cyclic variations within a 24-hour period (Fink, 1988). 
Thus, the time of sampling should be controlled rigorously to avoid excessive variability (Nett, 1989). 
Sequential sampling can allow detection of treatment-related changes in circadian and pulsatile rhythms. 

The pattern seen in levels of reproductive system hormones can provide useful information 
about the possible site and type of effect on reproductive system function. For example, if a compound 
acts at the level of the hypothalamus or pituitary, then serum LH and FSH may be decreased, leading to 
decreased testosterone levels. On the other hand, severe interference with Sertoli cell function or 
spermatogenesis would be expected to elevate serum FSH levels. An agent having antiandrogenic 
activity in adults might elevate serum LH and testosterone. Testis weight might be unaffected, while the 
weight and size of the accessory sex glands may be reduced. The endocrine profile presented by 
exposure to specific antiandrogens can differ markedly because of differences in tissue specificity and 
receptor kinetics, as well as age at which exposure occurred. 
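
The three example patterns in the preceding paragraph can be written as a small lookup, shown below as a minimal sketch. The table structure, the direction labels ("up", "down", "normal"), and the function name are illustrative assumptions and are not part of these Guidelines. Because endocrine profiles can differ markedly with tissue specificity, receptor kinetics, and age at exposure, a match is only a prompt for further evaluation and not a determination of the site of action.

```python
# Illustrative only: encodes the three example patterns from the text above.
# Direction labels and names are assumptions made for this sketch.
PATTERNS = [
    ({"LH": "down", "FSH": "down", "testosterone": "down"},
     "hypothalamus or pituitary"),
    ({"FSH": "up"},
     "Sertoli cell function or spermatogenesis"),
    ({"LH": "up", "testosterone": "up"},
     "antiandrogenic activity (adult exposure)"),
]


def candidate_sites(profile):
    """Return each example site whose listed hormone directions all match.

    `profile` maps a hormone name to "up", "down", or "normal" relative
    to concurrent controls sampled at the same time of day.
    """
    return [
        site
        for pattern, site in PATTERNS
        if all(profile.get(hormone) == direction
               for hormone, direction in pattern.items())
    ]


central = candidate_sites({"LH": "down", "FSH": "down", "testosterone": "down"})
sertoli = candidate_sites({"LH": "normal", "FSH": "up", "testosterone": "normal"})
antiandrogen = candidate_sites({"LH": "up", "FSH": "normal", "testosterone": "up"})
unremarkable = candidate_sites({"LH": "normal", "FSH": "normal", "testosterone": "normal"})
```

A profile that matches no pattern returns an empty list, which should not be read as evidence of no effect.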

3.2.6.1. Adverse Effects 

In the absence of endocrine data, significant effects on reproductive system anatomy, sexual 
behavior, pituitary, uterine or accessory sex gland weights or histopathology, female cycle normality, or 
Leydig cell histopathology may suggest disruption of the endocrine system. In those instances, 
additional testing for endocrine effects may be indicated. Significant alterations in circulating levels of 
estrogen, progesterone, testosterone, prolactin, LH, or FSH may be indicative of existing pituitary or 
gonadal injury. When significant alterations from control levels are observed in those hormones, the 
changes should be considered cause for concern because they are likely to affect, occur in concert 
with, or result from alterations in gametogenesis, gamete maturation, mating ability, or fertility. Such 
effects, if compatible with other available information, may be considered adverse and may be used to 
establish a NOAEL, LOAEL, or benchmark dose. Furthermore, endocrine data may facilitate 
identification of sites or mechanisms of toxicant action, especially when obtained after short-term 
exposures. 

3.2.7. In Vitro Tests of Reproductive Function 

Numerous in vitro tests are available and under development to measure or detect chemically 
induced changes in various aspects of both male and female reproductive systems (Kimmel, G.L. et al., 
1995). These include in vitro fertilization using isolated gametes, whole organ (e.g., testis, ovary) 
perfusion, culture of isolated cells from the reproductive organs (e.g., Leydig cells, Sertoli cells, 
granulosa cells, oviductal or epididymal epithelium), co-culture of several populations of isolated cells, 
ovaries, quarter testes, seminiferous tubule segments, various receptor binding assays on reproductive 
cells and transfected cell lines, and others. 

Tests of sperm properties and function that have been applied to reproductive toxicology 
include penetration of sperm through viscous medium (Yeung et al., 1992), in vitro capacitation and 
fertilization assays (Holloway et al., 1990a,b; Perreault and Jeffay, 1993; Slott et al., 1995), and 
evaluation of sperm nuclear integrity (Darney, 1991). In addition, evaluation of human sperm function 
may include sperm penetration of cervical mucus, ability of sperm to undergo an acrosome reaction, 
and ability to penetrate zona pellucida-free hamster oocytes or bind to human hemi-zona pellucidae 
(Franken et al., 1990; Liu and Baker, 1992). 

The diagnostic information obtained from such tests may help to identify potential effects on the 
reproductive systems. However, each test bypasses essential components of the intact animal system 
and therefore, by itself, is not capable of predicting exposure levels that would result in toxicity in intact 
animals. While it is desirable to replace whole animal testing to the extent possible with in vitro tests, 
the use of such tests currently is to screen for toxicity potential and to study mechanisms of action and 
metabolism (Perreault, 1989; Holloway et al., 1990a,b). 

3.3. HUMAN STUDIES 

In principle, human data are scientifically preferable for risk assessment since test animal to 
human extrapolation is not required. At this time, reproductive data for humans are available for only a 
limited number of toxicants. Many of these are from occupational settings in which exposures tend to 
be higher than in environmental settings. As more data become available, expanding the number of 
agents and endpoints studied and improving exposure assessment, more risk assessments will include 
these data. The following describes the methods of generation and evaluation of human data and the 
relative weight the various types of human data should be given in risk assessments. 

"Human studies" include both epidemiologic studies and other reports of individual cases or 
clusters of events. Typical epidemiologic studies include (1) cohort studies in which groups are defined 
by exposure and health outcomes are examined; (2) case-referent studies in which groups are defined 
by health status and prior exposures are examined; (3) cross-sectional studies in which exposure and 
outcome are determined at the same time; and (4) ecologic studies in which exposure is presumed based 
typically on residence. Greatest weight should be given to carefully designed epidemiologic studies with 
more precise measures of exposure, because they can best evaluate exposure-response relationships. 
This assumes that human exposures occur in broad enough ranges for observable differences in 
response to occur. Epidemiologic studies in which exposure is presumed, based on occupational title 
or residence (e.g., some case-referent and all ecologic studies), may contribute data for hazard 
characterization, but are of limited use for quantitative risk determination because of the generally broad 
categorical groupings of exposure. Reports of individual cases or clusters of events may generate 
hypotheses of exposure-outcome associations, but require further confirmation with well-designed 
epidemiologic or laboratory studies. These reports of cases or clusters may support associations 
suggested by other human or test animal data, but cannot stand by themselves in risk assessments. 

3.3.1. Epidemiologic Studies 

Good epidemiologic studies provide valuable data for assessment of human risk. As there are 
many different designs for epidemiologic studies, simple rules for their evaluation do not exist. Risk 
assessors should seek the assistance of professionals trained in epidemiology when conducting a 
detailed analysis. The following is an overview of key issues to consider in evaluation for risk 
assessment of reproductive effects. 

3.3.1.1. Selection of Outcomes for Study 

As already discussed, a number of endpoints can be considered in the evaluation of adverse 
reproductive effects. However, some of the outcomes are not easily observed in humans, such as early 
embryonic loss, reproductive capacity of the offspring, and invasive evaluations of reproductive function 
(e.g., testicular biopsies). Currently, the most feasible endpoints for epidemiologic studies are (1) 
indirect measures of fertility/infertility; (2) reproductive history studies of some pregnancy outcomes 
(e.g., embryonic/fetal loss, birth weight, sex ratio, congenital malformations, postnatal function, and 
neonatal growth and survival); (3) semen evaluations; (4) menstrual history; and (5) blood or urinary 
hormone measures. Factors requiring control in the design or analysis (such as effect modifiers and 
confounders, described below) may vary depending on the specific outcomes selected for study. 

The reproductive outcomes available for epidemiologic examination are limited by a number of 
factors, including the relative magnitude of the exposure, the size and demographic characteristics of the 
population, and the ability to observe the outcome in humans. Use of improved methods for identifying 
some outcomes, such as embryonic loss detected by more sensitive urinary hCG (human chorionic 
gonadotropin) assays, changes the spectrum of outcomes available for study (Wilcox et al., 1985; 
Sweeney et al., 1988; Zinaman et al., 1996). Other, less accessible, endpoints may require invasive 
techniques to obtain samples (e.g., histopathology) or may have high intra- or interindividual variability 
(e.g., serum hormone levels, sperm count). 

Demographic characteristics of the population, such as marital status, age, education, 
socioeconomic status (SES), and prior reproductive history are associated with the probability of 
whether couples will attempt to have children. Differences in birth control practices would also affect 
the number of outcomes available for study. 

In addition to the above-mentioned factors, reproductive endpoints may be envisioned as 
effects recognized at various points in a continuum starting before conception and continuing through 
death of the progeny. Many studies, however, are limited to evaluating endpoints at a particular time in 
this continuum. For example, in a study of defects observed at live birth, a malformed stillbirth would 
not be included, even though the etiology could be identical (Bloom, 1981). Also, a different spectrum 
of outcomes could result from differences in timing or in level of exposure (Selevan and Lemasters, 
1987). 

3.3.1.1.1. Human reproductive endpoints. The following section discusses various human male and 
female reproductive endpoints. These outcomes may be indicators of sub- or infertility. These are 
followed by a discussion of reproductive history studies. 

3.3.1.1.1.1. Male endpoints - semen evaluations. The use of semen analysis was discussed in 
Section 3.2.3.4. Most epidemiologic studies of potential effects of agents on semen characteristics 
have been conducted in occupational groups and patients receiving drug therapy. Obtaining a high level 
of participation in the workforce has been difficult, because social and cultural attitudes concerning sex 
and reproduction may affect cooperation of the study groups. Increased participation may occur in 
men who are planning to have children or who are concerned about existing reproductive problems or 
possible ill effects of their exposures. Unless controlled, such biased participation may yield 
unrepresentative estimates of risk associated with exposure, resulting in data that are less useful for risk 
assessment. While some studies have response rates greater than 70% (Ratcliffe et al., 1987; Welch et 
al., 1988), response rates are often less than 70% in such studies and may be even lower in the 
comparison group (Egnatz et al., 1980; Lipshultz et al., 1980; Milby and Whorton, 1980; Lantz et al., 
1981; Meyer, 1981; Milby et al., 1981; Rosenberg et al., 1985; Ratcliffe et al., 1989). Some of the 
low response rates may be caused by inclusion of vasectomized men in the total population, although 
this could vary widely by population (Milby and Whorton, 1980). Participation in the comparison 
group may be biased toward those with preexisting reproductive problems. The response rate may be 
improved substantially with proper education and payment of subjects (Ratcliffe et al., 1986, 1987). 

Several factors may influence the semen evaluation, including the period of abstinence 
preceding collection of the sample, health status, and social habits (e.g., alcohol, recreational drugs, 
smoking). Data on these factors may be collected by interview, subject to the limitations described for 
pregnancy outcome studies. 

Reports of studies with semen analyses have rarely included an evaluation of endocrine status 
(hormone levels in blood or urine) of exposed males (Lantz et al., 1981; Ratcliffe et al., 1989). 
Conversely, studies that have examined endocrine status typically do not have data on semen quality 
(Mason, 1990; McGregor and Mason, 1991; Egeland et al., 1994). 

3.3.1.1.1.2. Female endpoints. Reproductive effects may result from a variety of exposures. For 
example, environmental exposures may be toxic to the oocyte, producing a loss of primary oocytes that 
irreversibly affects the woman's fecundity. The exposures of importance may occur during the prenatal 
period, and beyond. Oocyte depletion is difficult to examine directly in women because of the 
invasiveness of the tests required; however, it can be studied indirectly through evaluation of the age at 
reproductive senescence (menopause) (Everson et al., 1986). 

Numerous diagnostic methods have been developed to evaluate female reproductive 
dysfunction. Although these methods have been used rarely for occupational or environmental 
toxicologic evaluations, they may be helpful in defining biologic parameters and the mechanisms related 
to female reproductive toxicity. If clinical observations are able to link exposures to the reproductive 
effect of concern, these data will aid the assessment of adverse female reproductive toxicity. The 
following clinical observations include endpoints that may be reported in case reports or epidemiologic 
research studies. 

Reproductive dysfunction also can be studied by the evaluation of irregularities of menstrual 
cycles. However, menstrual cyclicity is affected by many parameters such as age, nutritional status, 
stress, exercise level, certain drugs, and the use of contraceptive measures that alter endocrine 
feedback. Vaginal bleeding at menstruation is a reflection of withdrawal of steroidogenic support, 
particularly progesterone. Vaginal bleeding can occur at midcycle, in early miscarriage, after 
withdrawal of contraceptive steroids, or after an inadequate luteal phase. The length of the menstrual 
cycle, particularly the follicular phase (before ovulation), can vary between individuals and may make it 
difficult to determine significant effects on length in populations of women (Burch et al., 1967; Treloar et 
al., 1967). Human vaginal cytology may provide information on the functional state of reproductive 
cycles. Cytologic evaluations, along with the evaluation of changes in cervical mucus viscosity, can be 
used to estimate the occurrence of ovulation and determine different stages of the reproductive cycle 
(Kesner et al., 1992). Menstrual dysfunction data have been used to examine adverse reproductive 
effects in women exposed to potentially toxic agents occupationally (Lemasters, 1992). 

Reports of prospective clinical evaluations of menstrual function (Kesner et al., 1992; Wright et 
al., 1992) have shown urinary endocrine measures to be practical and useful. The endocrine status of 
a woman can be evaluated by the measurement of hormones in blood and urine. Progesterone can also 
be measured in saliva. Because the female reproductive endocrine milieu changes in a cyclic pattern, 
single sample analysis does not provide adequate information for evaluating alterations in reproductive 
function. Still, a single sample for progesterone determination some 7 to 9 days after the estimated 
midcycle surge of gonadotropins in a regularly cycling woman may provide suggestive evidence for the 
presence of a functioning corpus luteum and prior follicular maturation and ovulation. Clinically 
abnormal levels of gonadotropins, steroids, or other biochemical parameters may be detected from a 
single sample. However, a much stronger design involves collection of multiple samples and their 
observation in conjunction with events in the menstrual cycle. 

The day of ovulation can be estimated by the biphasic shift in basal body temperature. 
Ovulation can also be detected by serial measurement of hormones in the blood or urine and analyses 
of estradiol and gonadotropin status at midcycle. After ovulation, luteal phase function can be assessed 
by analysis of progesterone secretion and by evaluation of endometrial histology. Tubal patency, which 
could be affected by abnormal development, endometriosis or infection, is an endpoint that can be 
observed in clinical evaluations of reproductive function (Forsberg, 1981). These latter evaluations of 
endometrial histology and tubal patency are less likely to be present in epidemiologic studies or 
surveillance programs because of the invasiveness of the procedures. 

3.3.1.2. Reproductive History Studies 

3.3.1.2.1. Measures of fertility. Subfertility may be thought of as nonevents: a couple is unable to 
have children within a specific time frame. Therefore, the epidemiologic measurement of reduced 
fertility or fecundity is typically indirect and is accomplished by comparing birth rates or time intervals 
between births or pregnancies. These outcomes have been examined using several methods: the 
Standardized Birth Ratio (SBR; also referred to as the Standardized Fertility Ratio) and the length of 
time to pregnancy or birth. In these evaluations, the couple's joint ability to procreate is estimated. The 
SBR compares the number of births observed to those expected based on the person-years of 
observation preferably stratified by factors such as time period, age, race, marital status, parity, and (if 
possible) contraceptive use (Wong et al., 1979; Levine et al., 1980, 1981, 1983; Levine, 1983; Starr 
et al., 1986). The SBR is analogous to the Standardized Mortality Ratio (SMR), a measure frequently 
used in studies of occupational cohorts, and has similar limitations in interpretation (Gaffey, 1976; 
McMichael, 1976; Tsai and Wen, 1986). The SBR was found to be less sensitive in identifying an 
effect when compared to semen analyses (Welch et al., 1991). These data can also be analyzed using 
Poisson regression. 
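
As a minimal sketch of the SBR calculation described above, the following computes expected births by applying reference birth rates to the person-years observed in each stratum, then divides observed births by that total. The stratum labels, person-years, reference rates, and birth counts are hypothetical values chosen for illustration, and the function and field names are assumptions of this sketch rather than an established procedure.

```python
from collections import namedtuple

# One row per stratum (e.g., age group within a time period).
# reference_rate is births per person-year in the comparison population.
Stratum = namedtuple(
    "Stratum", "label person_years reference_rate observed_births"
)


def standardized_birth_ratio(strata):
    """Return (observed, expected, SBR) summed over all strata."""
    expected = sum(s.person_years * s.reference_rate for s in strata)
    observed = sum(s.observed_births for s in strata)
    if expected <= 0:
        raise ValueError("expected births must be positive")
    return observed, expected, observed / expected


# Hypothetical exposed cohort, stratified by age only.
strata = [
    Stratum("age 20-29", 400.0, 0.10, 30),
    Stratum("age 30-39", 600.0, 0.05, 24),
]

observed, expected, sbr = standardized_birth_ratio(strata)
print(f"observed={observed}, expected={expected:.1f}, SBR={sbr:.2f}")
```

An SBR below 1 indicates fewer births than expected from the reference rates. As with the SMR, the ratio depends on how well the strata capture factors such as age, parity, and contraceptive use.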

Analysis of the time between recognized pregnancies or live births is a more recent approach to 
indirect measurement of fertility (Dobbins et al., 1978; Baird and Wilcox, 1985; Baird et al., 1986; 
Weinberg and Gladen, 1986; Rowland et al., 1992). Because the time between births increases with 
increasing parity (Leridon, 1977), comparisons within birth order (parity) are more appropriate. A 
statistical method (Cox regression) can stratify by birth or pregnancy order to help control for 
nonindependence of these events in the same woman or couple. 
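
The comparison within birth order can be illustrated with a simple descriptive summary, sketched below. This is not Cox regression; it only groups intervals by parity before comparing exposed and referent groups, so that later-order births (which tend to have longer intervals) are not compared with earlier ones. The records, group labels, and function name are hypothetical and introduced for illustration only.

```python
from collections import defaultdict
from statistics import median


def median_interval_by_parity(records):
    """Median months between births, keyed by (parity, group).

    Each record is (group, parity, months since the previous birth).
    """
    grouped = defaultdict(list)
    for group, parity, months in records:
        grouped[(parity, group)].append(months)
    return {key: median(values) for key, values in grouped.items()}


# Hypothetical intervals in months.
records = [
    ("exposed", 1, 30), ("exposed", 1, 36), ("exposed", 1, 42),
    ("referent", 1, 24), ("referent", 1, 28), ("referent", 1, 32),
    ("exposed", 2, 40), ("exposed", 2, 44),
    ("referent", 2, 34), ("referent", 2, 38),
]

medians = median_interval_by_parity(records)
for parity in (1, 2):
    print(parity, medians[(parity, "exposed")], medians[(parity, "referent")])
```

A full analysis would also account for censored intervals (couples who have not yet had a subsequent birth), which this descriptive summary ignores.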

Fertility may also be affected by alterations in sexual behavior. However, data linking toxic 
exposures to these alterations in humans are limited and are not obtained easily in epidemiology studies 
(see Section 3.3.1.4). 

3.3.1.2.2. Developmental outcomes. Developmental outcomes examined in human studies of 
parental exposures may include embryo or fetal loss, congenital malformations, birth weight effects, sex 
ratio at birth, and possibly postnatal effects (e.g., physical growth and development, organ or system 
function, and behavioral effects of exposure). Developmental effects are discussed in more detail in the 
Guidelines for Developmental Toxicity Risk Assessment (U.S. EPA, 1991). As mentioned above, 
epidemiologic studies that focus on only one type of developmental outcome or exposures to only one 
parent may miss a true effect of exposure. 

Evidence of a dose-response relationship is usually an important criterion in the assessment of 
exposure to a potentially toxic agent. However, traditional dose-response relationships may not always 
be observed for some endpoints (Wilson, 1973; Selevan and Lemasters, 1987). For example, with 
increasing dose, a pregnancy might end in embryo or fetal loss, rather than a live birth with 
malformations. A shift in the patterns of outcomes could result from differences either in level of 
exposure or in timing (Wilson, 1973; Selevan and Lemasters, 1987) (for a more detailed description, 
see Section 3.3.1.4). Therefore, a risk assessment should, when possible, attempt to look at the 
relationship of different reproductive endpoints and patterns of exposure. 
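
The shift in outcome patterns described above can be illustrated with a small worked example. All counts below are hypothetical and chosen only to show the arithmetic: as embryo or fetal loss rises with dose, the malformation rate among live births can fall at the highest dose even though the combined rate of adverse outcomes keeps rising.

```python
# Hypothetical counts per exposure group:
# (label, recognized pregnancies, embryo/fetal losses, malformed live births)
groups = [
    ("none", 1000, 150, 26),
    ("low", 1000, 200, 48),
    ("high", 1000, 450, 22),
]


def outcome_rates(pregnancies, losses, malformed_live):
    """Rates for a single endpoint versus the combined endpoints."""
    live_births = pregnancies - losses
    return {
        "malformed_per_live_birth": malformed_live / live_births,
        "any_adverse_per_pregnancy": (losses + malformed_live) / pregnancies,
    }


rates = {label: outcome_rates(n, lost, malformed)
         for label, n, lost, malformed in groups}

for label, r in rates.items():
    print(f"{label:>4}: malformed/live birth = "
          f"{r['malformed_per_live_birth']:.3f}, "
          f"any adverse/pregnancy = {r['any_adverse_per_pregnancy']:.3f}")
```

A study limited to malformations at live birth would see no monotonic dose-response in these hypothetical data, whereas the combined endpoint increases across all three groups.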

In addition to the above effects, exposure may produce genetic damage to germ cells. 
Outcomes resulting from germ-cell mutations could include reduced probability of fertilization and 
increased probability of embryo or fetal loss and postnatal developmental effects. Based on studies 
with test species, germ cells or early zygotes are critical targets of potentially toxic agents. Germ-cell 
mutagenicity could be expressed also as genetic diseases in future generations. Unfortunately, these 
studies are difficult to conduct in human populations because of the long time between exposure and 
outcome and the large study groups needed. For more information and guidance on the evaluation of 
these data, refer to the Guidelines for Mutagenicity Risk Assessment (U.S. EPA, 1986c). 

3.3.1.3. Community Studies and Surveillance Programs 

Epidemiologic studies may be based on broad populations such as a community, a nationwide 
probability sample, or surveillance programs (such as birth defects registries). Some studies have 
examined the effects of environmental exposures such as potential toxic agents in outdoor air, food, 
water, and soil. These studies may assume certain exposures through these routes due to residence 
(ecologic studies). The link between environmental measurements and critical periods of exposure for a 
given reproductive effect may be difficult to make. Other studies may go into more detail, evaluating 
the above routes and also indoor air, house dust, and occupational exposures on an individual basis 
(Selevan, 1991). Such environmental studies, relating individual exposures to health outcomes, should
have less misclassification of exposure. 

Exposure definition in community studies has some limitations in the assessment of exposure-effect
relationships. For example, in many community-based studies, it may not be possible to
distinguish maternally mediated effects from paternally mediated effects since both parents spend time in 
the same home environment. In addition, the presumably lower exposure levels (compared with 
industrial settings) may require very large groups for the study. A number of case-referent studies have 
examined the relationship between broad classes of parental occupation in certain communities or 
countries and embryo/fetal loss (Silverman et al., 1985; McDonald et al., 1989; Lindbohm et al.,
1991), birth defects (Hemminki et al., 1980; Kwa and Fine, 1980; Papier, 1985), and childhood
cancer (Fabia and Thuy, 1974; Hemminki et al., 1981; Peters et al., 1981; Gardner et al., 1990a,b).
In these reports, jobs are classified typically into broad categories based on the probability of exposure 
to certain classes or levels of exposure. Such studies are most helpful in the identification of topics for 
additional study. However, because of the broad groupings of types or levels of exposure, these
studies are not typically useful for risk assessment of any one particular agent. 

Surveillance programs may also exist in occupational settings. In this case, reproductive 
histories (including menstrual cycles) or semen evaluations could be followed to monitor reproductive 
effects of exposures. With adequate exposure information, these could yield very useful data for risk 
assessment. Reproductive histories tend to be easier and less costly to collect, whereas a semen
evaluation program would be rather costly. Success with such programs in the workplace will be 
determined by the confidence the worker has that reproductive data are kept confidential and will not 
affect employment status (Samuels, 1988; Lemasters and Selevan, 1993). 

3.3.1.4. Identification of Important Exposures for Reproductive Effects 

For all examinations of the relationship between reproductive effects and potentially toxic 
exposures, defining the exposure that produces the effect is crucial. Preconceptional exposures of 
either parent and in utero exposures have been associated with the more commonly examined 
outcomes (e.g., fetal loss, malformations, low birth weight, and measures of in- or subfertility). These
exposures, plus postnatal exposure via breast milk, food, and the environment, may also be associated 
with postnatal developmental effects (e.g., changes in growth or in behavioral and cognitive function). 

A number of factors affect the intensity and duration of exposure. General environmental 
exposures are typically lower than those found in industrial or agricultural settings. However, this 
relationship may change as exposures are reduced in workplaces and as more is learned about 
environmental exposures (e.g., indoor air exposures, home pesticide usage). Larger populations are 
necessary to achieve sufficient power in settings with lower exposures, which are likely to have lower
measures of risk (Lemasters and Selevan, 1984). In addition, exposure to individuals may change as 
they move in and out of areas with differing levels and types of exposures, thus affecting the number of 
exposed and comparison events for study. 

Data on exposure from human studies are frequently qualitative, such as employment or 
residence histories. More quantitative data may be difficult to obtain because of the nature of certain 
study designs (e.g., retrospective studies) and limitations in estimates of historic exposures. Many 
reproductive effects result from exposures during certain critical times. The appropriate exposure 
classification depends on the outcomes studied, the biologic mechanism affected by exposure, and the 
biologic half-life of the agent. The half-life, in combination with the patterns of exposure (e.g., 
continuous or intermittent) affects the individual's body burden and consequently the actual dose during 
the critical period. The probability of misclassification of exposure status may affect the ability to 
recognize a true effect in a study (Selevan, 1981; Hogue, 1984; Lemasters and Selevan, 1984; Sever
and Hessol, 1984; Kimmel, C.A. et al., 1986). As more prospective studies are done, better estimates
of exposure should be developed. 
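
The interplay of half-life and exposure pattern can be illustrated with a minimal sketch. It assumes a one-compartment model with first-order elimination, which is a simplification introduced here and not a model prescribed by these Guidelines; the function name, doses, and half-lives are hypothetical.

```python
import math

def body_burden(daily_doses, half_life_days):
    """Track body burden day by day under first-order elimination.

    daily_doses: amount absorbed on each day (hypothetical units).
    half_life_days: biologic half-life of the agent, in days.
    """
    retained = math.exp(-math.log(2) / half_life_days)  # fraction left after one day
    burden, history = 0.0, []
    for dose in daily_doses:
        burden = burden * retained + dose
        history.append(burden)
    return history

continuous = [1.0] * 28                     # same dose every day for four weeks
intermittent = ([7.0] + [0.0] * 6) * 4      # same weekly total, delivered one day per week

short_half_life = body_burden(continuous, half_life_days=1)
long_half_life = body_burden(continuous, half_life_days=10)
pulsed = body_burden(intermittent, half_life_days=1)
```

Under these assumptions, the same cumulative exposure produces quite different burdens on any given day, which is why the timing of the critical period relative to the exposure pattern matters for exposure classification.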

3.3.1.5. General Design Considerations 

The factors that enhance a study and thus increase its usefulness for risk assessment have been 
noted in a number of publications (Selevan, 1980; Bloom, 1981; Hatch and Kline, 1981; Wilcox, 
1983; Sever and Hessol, 1984; Axelson, 1985; Tilley et al., 1985; Kimmel, C.A. et al., 1986; Savitz
and Harlow, 1991). Some of the more prominent factors are discussed below. 

3.3.1.5.1. The power of the study. The power, or ability of a study to detect a true effect, is 
dependent on the size of the study group, the frequency of the outcome in the general population, and 
the level of excess risk to be identified. In a cohort study, common outcomes, such as recognized fetal 
loss, require hundreds of pregnancies to have a high probability of detecting a modest increase in risk 
(e.g., 133 pregnancies in both exposed and unexposed groups to detect a twofold increase; alpha = 0.05, 
power = 80%), while less common outcomes, such as the total of all malformations recognized at birth, 
require thousands of pregnancies to have the same probability (e.g., more than 1,200 pregnancies in 
both exposed and unexposed groups) (Bloom, 1981; Selevan, 1981, 1985; Sever and Hessol, 1984; 
Stein, Z. et al., 1985; Kimmel, C.A. et al., 1986). Semen evaluation may require fewer subjects
depending on the sperm parameters evaluated, especially when each man is used as his own control 
(Wyrobek, 1982, 1984). In case-referent studies, study sizes are dependent upon the frequency of 
exposure within the source population. The confidence one has in the results of a study showing no 
effect is related directly to the power of the study to detect meaningful differences in the endpoints. 
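
The dependence of study size on outcome frequency can be sketched with a standard normal-approximation formula for comparing two proportions. The background rates used below (15% and 3%) are assumptions chosen for illustration only, and the calculation is not intended to reproduce the specific figures quoted above, which may rest on different rates or corrections.

```python
import math
from statistics import NormalDist

def pregnancies_per_group(p_unexposed, relative_risk, alpha=0.05, power=0.80):
    """Approximate group size needed to detect a given relative risk."""
    p0 = p_unexposed
    p1 = p_unexposed * relative_risk
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p0 + p1) / 2
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p0 * (1 - p0) + p1 * (1 - p1))) ** 2
    return math.ceil(numerator / (p1 - p0) ** 2)

common_outcome = pregnancies_per_group(0.15, 2.0)   # assumed 15% background rate
rare_outcome = pregnancies_per_group(0.03, 2.0)     # assumed 3% background rate
```

For the same twofold increase, the rarer outcome requires a substantially larger group, consistent with the contrast drawn above between fetal loss and malformations.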

Power may be enhanced by combining populations from several studies using a meta-analysis 
(Greenland, 1987). The combined analysis could increase confidence in the absence of risk for agents 
showing no effect. However, caution must be exercised in the combination of potentially dissimilar 
study groups. 
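
One common way to combine studies is inverse-variance (fixed-effect) pooling of log relative risks. The sketch below is illustrative only: the study estimates are hypothetical, and the fixed-effect model is an assumption that is reasonable only when the study groups are judged sufficiently similar.

```python
import math

def pool_fixed_effect(estimates):
    """estimates: list of (log_relative_risk, standard_error), one pair per study."""
    weights = [1 / se ** 2 for _, se in estimates]
    pooled = sum(w * y for w, (y, _) in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    return pooled, pooled_se

# Hypothetical studies, each too small to exclude a modest effect on its own.
studies = [(math.log(1.2), 0.40), (math.log(0.9), 0.35), (math.log(1.1), 0.50)]
pooled_log_rr, pooled_se = pool_fixed_effect(studies)
```

The pooled standard error is smaller than that of any single study, which is the sense in which the combined analysis gains power.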

Results of a negative study should be carefully evaluated, examining the power of the study and 
the degree of concordance or discordance between that study and other studies (including careful 
examination of comparability in the details such as similarity of adverse endpoints and study design). 
The consistency among results of different studies could be evaluated by comparing statistical 
confidence intervals for the effects found in different studies. Studies with lower power will tend to yield 
wider confidence intervals. If the confidence intervals from a negative study and a positive study
overlap, then there may be no conflict between the results of the two studies. 
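
The comparison of confidence intervals can be sketched as follows, using the usual log-scale approximation for the interval around a risk ratio. The counts are hypothetical and the function names are introduced here for illustration.

```python
import math
from statistics import NormalDist

def risk_ratio_ci(exposed_events, exposed_n, unexposed_events, unexposed_n, level=0.95):
    """Return (risk ratio, lower limit, upper limit) using the log-scale approximation."""
    rr = (exposed_events / exposed_n) / (unexposed_events / unexposed_n)
    se = math.sqrt(1 / exposed_events - 1 / exposed_n
                   + 1 / unexposed_events - 1 / unexposed_n)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return rr, math.exp(math.log(rr) - z * se), math.exp(math.log(rr) + z * se)

def intervals_overlap(ci_a, ci_b):
    """ci_a, ci_b: (lower, upper) pairs."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]

# Hypothetical small "negative" study and larger "positive" study.
_, neg_low, neg_high = risk_ratio_ci(6, 40, 5, 40)
_, pos_low, pos_high = risk_ratio_ci(90, 300, 45, 300)
compatible = intervals_overlap((neg_low, neg_high), (pos_low, pos_high))
```

In this example the small study's interval includes 1 but is wide enough to contain the larger study's entire interval, so the two results need not be read as conflicting.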

3.3.1.5.2. Potential bias in data collection. Bias may result from the way the study group is 
selected or information is collected (Rothman, 1986). Selection bias may occur when an individual's 
willingness to participate varies with certain characteristics relating to exposure or health status. In 
addition, selection bias may operate in the identification of subjects for study. For example, in studies 
of very early pregnancy loss, use of hospital records to identify the study group will under-ascertain 
events, because women are not always hospitalized for these outcomes. More weight would be given 
in a risk assessment to a study in which a more complete list of pregnancies is obtained by, for example, 
collecting biologic data (e.g., human chorionic gonadotropin [hCG] measurements) of pregnancy status 
from study members. The representativeness of these data may be affected by selection factors related 
to the willingness of different groups of women to continue participation over the total length of the 
study. Interview data result in more complete ascertainment than hospital records; however, this
strategy carries with it the potential for recall bias, discussed in further detail below. Other examples of 
different levels of ascertainment of events include: (1) use of hospital records to study congenital 
malformations since hospital records contain more complete data on malformations than do birth 
certificates (Mackeprang et al., 1972; Snell et al., 1992) and (2) use of sperm bank or fertility clinic
data for semen studies. Semen data from either source are selected data because semen donors are 
typically of proven fertility, and men in fertility clinics are part of a subfertile couple who are actively 
trying to conceive. Thus, studies using the different record sources to identify reproductive outcomes 
need to be evaluated for ascertainment patterns prior to use in risk assessment. 

Studies of women who work outside the home present the potential for additional bias because 
some factors that influence employment status may also affect reproductive endpoints. For example, 
because of child-care responsibilities, women may terminate employment, as might women with a 
history of reproductive problems who wish to have children and are concerned about workplace 
exposures (Joffe, 1985; Lemasters and Pinney, 1989). Thus, retrospective studies of female exposure 
that do not include terminated women workers may be of limited use in risk assessment because the 
level of risk for these outcomes is likely to be overestimated (Lemasters and Pinney, 1989). 

Information bias may result from misclassification of characteristics of individuals or events 
identified for study. Recall bias, one type of information bias, may occur when respondents with 
specific exposures or outcomes recall information differently than those without the exposures or 
outcomes. Interview bias may result when the interviewer knows a priori the category of exposure 
(for cohort studies) or outcome (for case-referent studies) in which the respondent belongs. Use of
highly structured questionnaires and/or "blinding" of the interviewer reduces the likelihood of such bias. 
Studies with lower likelihood of such bias should carry more weight in a risk assessment. 

When data are collected by interview or questionnaire, the appropriate respondent depends on 
the type of data or study. For example, a comparison of husband-wife interviews on reproduction
found the wives' responses to questions on pregnancy-related events to be more complete and valid 
than those of the husbands, and the individual's self-report of his/her occupational exposures and health 
characteristics more reliable than his/her mate's report (Selevan, 1980; Selevan et al., 1982). Studies
based on interview data from the appropriate respondents would carry more weight than those from 
proxy respondents. 

Data from any source may be prone to errors or bias. All types of bias are difficult to assess; 
however, validation with an independent data source (e.g., vital or hospital records), or use of 
biomarkers of exposure or outcome, where possible, may suggest the degree of bias present and 
increase confidence in the results of the study. Those studies with a low probability of biased data 
should carry more weight (Axelson, 1985; Stein, A. and Hatch, 1987; Weinberg et al., 1994).

Differential misclassification (i.e., when certain subgroups are more likely to have misclassified 
data than others) may either raise or lower the risk estimate. Nondifferential misclassification will bias 
the results toward a finding of "no effect" (Rothman, 1986). 
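
The attenuation produced by nondifferential misclassification can be illustrated with expected counts in a case-referent table. The sketch assumes a binary exposure and applies the same (hypothetical) sensitivity and specificity of exposure classification to cases and referents.

```python
def odds_ratio(case_exposed, case_unexposed, ref_exposed, ref_unexposed):
    return (case_exposed * ref_unexposed) / (case_unexposed * ref_exposed)

def observed_counts(exposed, unexposed, sensitivity, specificity):
    """Expected counts after imperfect exposure classification."""
    seen_exposed = exposed * sensitivity + unexposed * (1 - specificity)
    seen_unexposed = exposed * (1 - sensitivity) + unexposed * specificity
    return seen_exposed, seen_unexposed

cases = (60, 40)        # hypothetical true counts: (exposed, unexposed)
referents = (30, 70)

true_or = odds_ratio(*cases, *referents)
# Same error rates in both groups, i.e., nondifferential misclassification.
obs_cases = observed_counts(*cases, sensitivity=0.8, specificity=0.9)
obs_referents = observed_counts(*referents, sensitivity=0.8, specificity=0.9)
observed_or = odds_ratio(*obs_cases, *obs_referents)
```

With these inputs the observed odds ratio falls between 1 and the true value, showing the bias toward "no effect."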

3.3.1.5.3. Collection of data on other risk factors, effect modifiers, and confounders. Risk 
factors for reproductive toxicity include such characteristics as age, smoking, alcohol or caffeine 
consumption, drug use, and past reproductive history. Groups of individuals may represent susceptible
subpopulations based on genetic, acquired (e.g., behavioral), or developmental characteristics (e.g., 
greater effect of childhood exposures). Known and potential risk factors should be examined to 
identify those that may be confounders or effect modifiers. An effect modifier is a factor that produces 
different exposure-response relationships at different levels of that factor. For example, age would be 
an effect modifier if the risk associated with a given exposure changed with age (e.g., if older men had 
semen changes with exposure while younger ones did not). A confounder is a variable that is a risk 
factor for the outcome under study and is associated with the exposure under study, but is not a 
consequence of the exposure. A confounder may distort both the magnitude and direction of the 
measure of association between the exposure of interest and the outcome. For example, smoking might 
be a confounder in a study of the association of socioeconomic status and fertility because smoking may 
be associated with both. 

Both effect modifiers and confounders need to be controlled in the study design and/or analysis
to improve the estimate of the effects of exposure (Kleinbaum et al., 1982). A more in-depth
discussion may be found elsewhere (Epidemiology Workgroup for the Interagency Regulatory Liaison
Group, 1981; Kleinbaum et al., 1982; Rothman, 1986). The statistical techniques used to control for
these factors require careful consideration in their application and interpretation (Kleinbaum et al.,
1982; Rothman, 1986). Studies that fail to account for these important factors should be given less 
weight in a risk assessment. 
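
Stratified analysis is one widely used way to control a confounder in the analysis. The sketch below compares a crude odds ratio with a Mantel-Haenszel summary across strata of a hypothetical confounder (smoking). The counts are invented for illustration, and the Mantel-Haenszel estimator is offered as one example of such techniques, not as the method these Guidelines require.

```python
def odds_ratio(a, b, c, d):
    """a, b: exposed with/without the outcome; c, d: unexposed with/without."""
    return (a * d) / (b * c)

def mantel_haenszel_or(strata):
    """strata: list of (a, b, c, d) tables, one per level of the confounder."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator

# Hypothetical data: smokers are more often exposed and have a higher outcome risk.
smokers = (40, 60, 10, 15)
nonsmokers = (2, 18, 10, 90)

crude = odds_ratio(*(sum(cells) for cells in zip(smokers, nonsmokers)))
adjusted = mantel_haenszel_or([smokers, nonsmokers])
```

Here the crude estimate suggests a strong association, while the stratum-adjusted estimate shows none, so the apparent effect is attributable entirely to the confounder in this constructed example.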

3.3.1.5.4. Statistical factors. As in studies of test animals, pregnancies experienced by the same 
woman are not fully independent events. For example, women who have had fetal loss are reported to 
be more likely to have subsequent losses (Leridon, 1977). In test animal studies, the litter can be used 
as the unit of measure to deal with nonindependence of response within the litter. In studies of humans, 
pregnancies are sequential, requiring analyses which consider nonindependence of events 
(Epidemiology Workgroup for the Interagency Regulatory Liaison Group, 1981; Kissling, 1981; 
Selevan, 1981; Zeger and Liang, 1986). If more than one pregnancy per woman is included, as is 
often necessary with small study groups, the use of nonindependent observations overestimates the true 
size of the groups being compared, thus artificially increasing the probability of reaching statistical 
significance (Stiratelli et al., 1984). Analysis problems may occur when (1) prior adverse outcomes are
due to the same exposures or (2) when prior adverse outcomes could result in changes in behaviors 
that could reduce exposures. Some approaches to deal with these issues have been suggested 
(Kissling, 1981; Stiratelli et al., 1984; Selevan, 1985; Zeger and Liang, 1986). These approaches 
include selecting one pregnancy per family (Selevan, 1985) or using generalized estimating equations 
(Zeger and Liang, 1986). 
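
The simpler of these approaches, selecting one pregnancy per woman, can be sketched as below. The record layout and the use of a seeded random choice are assumptions made for illustration; the cited work may specify a different selection rule.

```python
import random

def one_pregnancy_per_woman(pregnancies, seed=0):
    """pregnancies: list of dicts, each with at least a 'woman_id' key."""
    rng = random.Random(seed)          # fixed seed keeps the selection reproducible
    by_woman = {}
    for record in pregnancies:
        by_woman.setdefault(record["woman_id"], []).append(record)
    return [rng.choice(records) for records in by_woman.values()]

# Hypothetical reproductive histories.
records = [
    {"woman_id": "A", "order": 1, "loss": True},
    {"woman_id": "A", "order": 2, "loss": True},
    {"woman_id": "B", "order": 1, "loss": False},
    {"woman_id": "C", "order": 1, "loss": False},
    {"woman_id": "C", "order": 2, "loss": True},
    {"woman_id": "C", "order": 3, "loss": False},
]
selected = one_pregnancy_per_woman(records)
```

The resulting data set has one observation per woman, so the events analyzed are independent, at the cost of discarding information that methods such as generalized estimating equations would retain.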

3.3.2. Examination of Clusters, Case Reports, or Series 

The identification of cases or clusters of adverse reproductive effects is generally limited to 
those identified by the individuals involved or clinically by their physicians. The likelihood of 
identification varies with the gender of the exposed person. Identification of subfecundity in either 
gender is difficult. This might be thought of as identification of a nonevent (e.g., lack of pregnancies or 
children), and thus is much harder to recognize than are some developmental effects, including 
malformations, resulting from in utero exposure. 

The identification of cases or clusters of adverse male reproductive outcomes may be limited 
because of cultural norms that may inhibit the reporting of impaired fecundity in men. Identification is 
also limited by the decreased likelihood of recognizing adverse developmental effects in their offspring
as resulting from paternal exposure rather than maternal exposure. Thus far, only one agent causing 
human male reproductive toxicity, dibromochloropropane (DBCP), has been identified after 
observation of a cluster of infertility that resulted from male subfecundity. This cluster was identified 
because of an atypically high level of communication among the workers' wives (Whorton et al., 1977,
1979; Biava et al., 1978; Whorton and Milby, 1980).

Adverse effects identified in females through clusters and case reports have, thus far, been 
limited to adverse pregnancy outcomes such as fetal loss and congenital malformations. Identification of 
other effects, such as subfertility/subfecundity or menstrual cycle disorders, may be more difficult, as 
noted above. 

Case reports may have importance in the recognition of agents that cause reproductive toxicity. 
However, they are probably of greatest use in suggesting topics for further investigation. Reports of 
clusters and case reports/series are best used in risk assessment in conjunction with strong laboratory 
data to suggest that effects observed in test animals also occur in humans. 

3.4. PHARMACOKINETIC CONSIDERATIONS 

Extrapolation of toxicity data between species can be aided considerably by the availability of 
data on the pharmacokinetics of a particular agent in the species tested and, when available, in humans. 
Information on absorption, half-life, steady-state or peak plasma concentrations, placental metabolism 
and transfer, comparative metabolism, and concentrations of the parent compound and metabolites in 
target organs may be useful in predicting risk for reproductive toxicity. Information on the variability 
between humans and test species also may be useful in evaluating factors such as age-related 
differences in the balance between activation and deactivation of a toxic agent. These types of data 
may be helpful in defining the sequence of events leading to an adverse effect and the dose-response 
curve, developing a more accurate comparison of species sensitivity, including that of humans (Wilson 
et al., 1975, 1977), determining dosimetry at target sites, and comparing pharmacokinetic profiles for
various dosing regimens or routes of exposure. EPA's Office of Prevention, Pesticides, and Toxic 
Substances has published protocols for metabolism studies that may be adapted to provide information 
useful in reproductive toxicity risk assessment for a suspect agent. Pharmacokinetic studies in
reproductive toxicology are most useful if the data are obtained with animals that are at the same 
reproductive status and stage of life (e.g., pregnant, nonpregnant, embryo or fetus, neonate, 
prepubertal, adult) at which reproductive insults are expected to occur in humans. 



Specific guidance regarding both the development and application of pharmacokinetic data was 
agreed on by the participants of the Workshop on Dermal Developmental Toxicity Studies (Kimmel, 
C.A. and Francis, 1990). This guidance is also applicable to nondermal reproductive toxicity studies. 
Participants of the Workshop concluded that absorption data are needed whether or not a dermal
study shows effects. The results of a dermal study showing no effects and without blood
level data are potentially misleading and are inadequate for risk assessment, especially if interpreted as a 
"negative" study. In studies where adverse effects are detected, regardless of the route of exposure, 
pharmacokinetic data can be used to establish the internal dose in maternal and paternal animals for risk 
extrapolation purposes. 

The existence of a Sertoli cell barrier (formerly called the blood-testis barrier) in the 
seminiferous tubules may influence the pharmacokinetics of an agent with potential to cause testicular 
toxicity by restricting access of compounds to the adluminal compartment of seminiferous tubules. The 
Sertoli cell barrier is formed by tight junctions between Sertoli cells and divides the seminiferous 
epithelium into basal and adluminal compartments (Russell et al., 1990). The basal compartment 
contains the spermatogonia and primary spermatocytes to the preleptotene stage, whereas more 
advanced germ cells are located on the adluminal side. This selectively permeable barrier is most 
effective in limiting the access of large, hydrophilic molecules in the intertubular lymph to cells on the 
adluminal side. An analogous barrier in the ovary has not been found, although the zona pellucida and 
granulosa cells may modulate access of chemicals to oocytes (Crisp, 1992). 

The reproductive organs appear to have a wide range of metabolic capabilities directed at both 
steroid and xenobiotic metabolism. However, there are substantial differences between compartments 
within the organs in types and levels of enzyme activities (Mukhtar et al., 1978). Recognition of these
differences can be important in understanding the potential of agents to have specific toxic effects. 

Most pharmacokinetic studies have incompletely characterized the distribution of toxic agents 
and their subsequent metabolic fate within the reproductive organs. Generalizations based on hepatic 
metabolism are not necessarily adequate to predict the fate of the agent in the testis, ovary, placenta, or 
conceptus. For example, the metabolic profile for a given agent may differ in the male between the liver 
and the testis and in the female between the maternal liver, ovary, and placenta. Detailed interspecies 
comparisons of the metabolic capabilities of the testis, ovary, placenta, and conceptus also have not 
been conducted. For some xenobiotics, significant differences in metabolism have been identified 
between males and females (Harris, R.Z. et al., 1995). This is, in part, attributable to organizational
effects of the gonadal steroids in the developing liver (Gustafsson et al., 1980; Skett, 1988). Also, in
adults, the sex steroids have been shown to affect the activity of a number of enzymes involved in the
metabolism of administered compounds. Thus, the blood levels of a toxic agent, as well as the final
concentration in the target tissue, may differ significantly between sexes. If data are to be used 
effectively in interspecies comparisons and extrapolations for these target systems, more attention 
should be directed to the pharmacokinetic properties of chemicals in the reproductive organs and in 
other organs that are affected by reproductive hormones. 

3.5. COMPARISONS OF MOLECULAR STRUCTURE 

Comparisons of the chemical or physical properties of an agent with those of agents known to 
cause reproductive toxicity may provide some indication of a potential for reproductive toxicity. Such 
information may be helpful in setting priorities for testing of agents or for evaluation of potential toxicity 
when only minimal data are available. Structure-activity relationships (SAR) have not been well studied
in reproductive toxicology, and have had limited success in predicting reproductive toxicity. The early 
literature has been reviewed and a set of classifications offered relating structure to reported male 
reproductive system activity (Bernstein, 1984). Data are available that suggest structure-activity
relationships with limited utility in risk assessment for certain classes of chemicals (e.g., glycol ethers, 
some estrogens, androgens, other steroids, substituted phenols, retinoids, phthalate esters, short-chain 
halogenated hydrocarbon pesticides, alkyl-substituted polychlorinated dibenzofurans, PCBs, 
vinylcyclohexene and related olefins, halogenated propanes, metals, and azo dyes). McKinney and 
Waller (1994) have studied the qualitative SAR properties of PCBs with respect to their recognition by
thyroxine, Ah and estrogen receptors. Although generally limited in scope and in need of validation, 
such relationships provide hypotheses that can be tested. 

In spite of the limited information available on SAR in reproductive toxicology, under certain 
circumstances (e.g., in the case of new chemicals), this procedure can be used to evaluate the potential 
for toxicity when little or no other data are available. 

3.6. EVALUATION OF DOSE-RESPONSE RELATIONSHIPS 

The description and evaluation of dose-response relationships is a critical component of the 
hazard characterization. Evidence for a dose-response relationship is an important criterion in 
establishing a toxic reproductive effect. It includes the evaluation of data from both human and 
laboratory animal studies. When possible, pharmacokinetic data should be used to determine the 
effective dose at the target organ(s). When adequate dose-response data are available in humans and 
with a sufficient range of exposure, dose-response relationships in humans may be examined. Because 
quantitative data on human dose-response relationships are available infrequently, the dose-response
evaluation is usually based on the assessment of data from tests performed in laboratory animals. 

The dose-response relationships for individual endpoints, as well as the combination of 
endpoints, must be examined in data interpretation. Dose-response evaluations should consider the 
effects that competing risks between different endpoints may have on outcomes observed at different 
exposure levels. For example, an agent may interfere with cell function in such a manner that, at a low 
dose level, an increase in abnormal sperm morphology is observed. At higher doses, cell death may
occur, leading to a decrease in sperm counts and a possible decrease in proportion of abnormal sperm.

When data on several species are available, the selection of the data for the dose-response 
evaluation is based ideally on the response of the species most relevant to humans (e.g., comparable 
physiologic, pharmacologic, pharmacokinetic, and pharmacodynamic processes), the adequacy of 
dosing, the appropriateness of the route of administration, and the endpoints selected. However, 
availability of information on many of those components is usually very limited. For dose-response 
assessment, no single laboratory animal species can be considered the best in all situations for 
predicting risk of reproductive toxicity to humans. However, in some cases, such as in the assessment 
of physiologic parameters related to menstrual disorders, higher nonhuman primates are considered 
generally similar to the human. In the absence of a clearly most relevant species, data from the most 
sensitive species (i.e., the species showing a toxic effect at the lowest administered dose) are used, 
because humans are assumed to be at least as sensitive generally as the most sensitive animal species 
tested (Nisbet and Karch, 1983; Kimmel, C.A. et al., 1984, 1990; Hemminki and Vineis, 1985; 
Meistrich, 1986; Working, 1988). 

The evaluation of dose-response relationships includes the identification of effective dose levels 
as well as doses that are associated with low or no increased incidence of adverse effects compared 
with controls. Much of the focus is on the identification of the critical effect(s) (i.e., the adverse effect 
occurring at the lowest dose level) and the LOAEL and NOAEL or benchmark dose associated with 
the effect(s) (see Section 4). 
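
The identification of the NOAEL, LOAEL, and critical effect from dose-group findings can be sketched as follows. The endpoints, doses, and adverse-effect judgments are hypothetical; in practice the judgment of whether an effect is adverse at a given dose rests on statistical and biological evaluation that the code does not attempt to represent.

```python
def noael_loael(groups):
    """groups: list of (dose, adverse) pairs for treated groups, where adverse
    records whether an adverse effect was judged present relative to controls."""
    adverse_doses = [dose for dose, adverse in groups if adverse]
    loael = min(adverse_doses) if adverse_doses else None
    clean = [dose for dose, adverse in groups
             if not adverse and (loael is None or dose < loael)]
    noael = max(clean) if clean else None
    return noael, loael

# Hypothetical findings (dose in mg/kg-day) for two endpoints in one study.
endpoints = {
    "sperm motility": [(10, False), (50, True), (250, True)],
    "testis weight": [(10, False), (50, False), (250, True)],
}
results = {name: noael_loael(groups) for name, groups in endpoints.items()}

# The critical effect is taken here as the endpoint with the lowest LOAEL.
critical_effect = min(
    (name for name in results if results[name][1] is not None),
    key=lambda name: results[name][1],
)
```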

Generally, in studies that do not evaluate reproductive toxicity, only adult male and nonpregnant 
females are examined. Therefore, the possibility that pregnant females may be more sensitive to the 
agent is not tested. In studies in which reproductive toxicity has been evaluated, the effective dose 
range should be identified for both reproductive and other forms of systemic toxicity, and should be 
compared with the corresponding values from other adult toxicity data to determine if the pregnant or
lactating female may be more sensitive to an agent. 

In addition to identification of the range of doses that is effective in producing reproductive and
other forms of systemic toxicity for a given agent, the route of exposure, timing and duration of 
exposure, species specificity of effects, and any pharmacokinetic or other considerations that might 
influence the comparison with human exposure scenarios should be identified and evaluated. This 
information should always accompany the characterization of the health-related database (discussed in 
the next section). 

Because the developing organism is changing rapidly and is vulnerable at a number of stages, an 
assumption is made with developmental effects that a single exposure at a critical time in development 
may produce an adverse effect (U.S. EPA, 1991). Therefore, with inhalation exposures, the daily dose 
is usually not adjusted to a 24-hour equivalent duration with developmental toxicity unless appropriate 
pharmacokinetic data are available. However, for other reproductive effects, daily doses by the 
inhalation route may be adjusted for duration of exposure. The Agency is planning to review these 
stances to determine the most appropriate approach for the future. 
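
The distinction drawn above can be expressed as a simple calculation. The sketch assumes that duration adjustment is a proportional scaling of the exposure concentration by hours of exposure per day; that scaling rule, the function name, and the example values are assumptions made for illustration.

```python
def adjusted_concentration(concentration, hours_per_day, adjust_for_duration=True):
    """Scale an inhalation concentration to a 24-hour equivalent when requested.

    adjust_for_duration=False mirrors the default described above for
    developmental toxicity in the absence of pharmacokinetic data.
    """
    if not adjust_for_duration:
        return concentration
    return concentration * hours_per_day / 24.0

# Hypothetical study: 100 ppm for 6 hours per day.
developmental = adjusted_concentration(100.0, 6, adjust_for_duration=False)
other_reproductive = adjusted_concentration(100.0, 6)
```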

3.7. CHARACTERIZATION OF THE HEALTH-RELATED DATABASE 

This section describes evaluation of the health-related database on a particular chemical and 
provides criteria for judging the potential for that chemical to produce reproductive toxicity under the
exposure conditions inherent in the database. This determination provides the basis for judging whether
the available data are sufficient to characterize a hazard and to conduct quantitative dose-response 
analyses. It also should provide a summary and evaluation of the existing data and identify data gaps 
for an agent that is judged to have insufficient information to proceed with a quantitative dose-response 
analysis. Characterizing the available evidence in this way clarifies the strengths and uncertainties in a 
particular database. It does not address the level of concern, nor does it completely address 
determining relevance of available data for estimating human risk. Issues concerning relevance of
mechanisms of action and types of effects observed should be included in the hazard characterization. 
Both level of concern and relevance are discussed further as part of the final characterization of risk, 
taking into account the information concerning potential human exposure. Data from all potentially 
relevant studies, whether indicative of potential hazard or not, should be included in the hazard 
characterization. 

A complex interrelationship exists among study design, statistical analysis, and biologic 
significance of the data. Thus, substantial scientific judgment, based on experience with reproductive 
toxicity data and with the principles of study design and statistical analysis, may be required to evaluate 
the database adequately. In some cases, a database may contain conflicting data. In these instances, 
the risk assessor must consider each study's strengths and weaknesses within the context of the overall
database to characterize the evidence for assessing the potential hazard for reproductive toxicity. 
Scientific judgment is always necessary and, in many cases, interaction with scientists in specific 
disciplines (e.g., reproductive toxicology, epidemiology, genetic toxicology, statistics) is recommended. 

A scheme for judging the available evidence on the reproductive toxicity of a particular agent is 
presented below (Table 6). The scheme contains two broad categories, "Sufficient" and "Insufficient," 
which are defined in Table 6. Data from all available studies, whether or not indicative of potential 
concern, are evaluated and used in the hazard characterization for reproductive toxicity. The primary 
considerations are the human data, if available, and the experimental animal data. The judgment of 
whether data are sufficient or insufficient should consider a variety of parameters that contribute to the 
overall quality of the data, such as the power of the studies (e.g., sample size and variation in the data), 
the number and types of endpoints examined, replication of effects, relevance of route and timing of 
exposure for both human and experimental animal studies, and the appropriateness of the test species 
and dose selection in experimental animal studies. In addition, pharmacokinetic data and structure-activity 
considerations, data from other toxicity studies, as well as other factors that may affect the 
overall decision about the evidence, should be taken into account. 

In general, the characterization is based on criteria defined by these Guidelines as the minimum 
evidence necessary to characterize a hazard and conduct dose-response analyses. Establishing the 
minimum human evidence to proceed with quantitative analyses based on the human data is often 
difficult because there may be considerable variations in study designs and study group selection. The 
body of human data should contain convincing evidence as described in the "Sufficient Human 
Evidence" category. Because the human data necessary to judge whether or not a causal relationship 
exists are generally limited, few agents can be classified in this category. Agents that have been tested 
in laboratory animals according to EPA's two-generation reproductive effects test guidelines (U.S. 
EPA, 1982, 1985b, 1996a), but not limited to such designs (e.g., a continuous breeding study with two 
generations), generally would be included in the "Sufficient Experimental Animal Evidence/Limited 
Human Data" category. There are occasions in which more limited data regarding the potential 
reproductive toxicity of an agent (e.g., a one-generation reproductive effects study, a standard 
subchronic or chronic toxicity study in which the reproductive organs were well examined) are 
available. If reproductive toxicity is observed in these limited studies, the data may be used to the 
extent possible to reach a decision regarding hazard to the reproductive system, including determination 
of an RfD or RfC. In cases in which such limited data are available, it would be appropriate to adjust 
the uncertainty factor to reflect the attendant increased uncertainty regarding the use of these data until 
more definitive data are developed. Identification of the increased uncertainty and justification for the 
adjustment of the uncertainty factor should be stated clearly. 

Table 6. Categorization of the health-related database 



SUFFICIENT EVIDENCE 

The Sufficient Evidence category includes data that collectively provide enough information to 
judge whether or not a reproductive hazard exists within the context of effect as well as dose, duration, 
timing, and route of exposure. This category may include both human and experimental animal 
evidence. 

Sufficient Human Evidence 

This category includes agents for which there is convincing evidence from epidemiologic studies 
(e.g., case control and cohort) to judge whether exposure is causally related to reproductive toxicity. A 
case series in conjunction with other supporting evidence also may be judged as Sufficient Evidence. 
An evaluation of epidemiologic and clinical case studies should discuss whether the observed effects 
can be considered biologically plausible in relation to chemical exposure. 

Sufficient Experimental Animal Evidence/Limited Human Data 

This category includes agents for which there is sufficient evidence from experimental animal 
studies and/or limited human data to judge if a potential reproductive hazard exists. Generally, agents 
that have been tested according to EPA's two-generation reproductive effects test guidelines (but not 
limited to such designs) would be included in this category. The minimum evidence necessary to 
determine if a potential hazard exists would be data demonstrating an adverse reproductive effect in a 
single appropriate, well-executed study in a single test species. The minimum evidence needed to 
determine that a potential hazard does not exist would include data on an adequate array of endpoints 
from more than one study with two species that showed no adverse reproductive effects at doses that 
were minimally toxic in terms of inducing an adverse effect. Information on pharmacokinetics, 
mechanisms, or known properties of the chemical class may also strengthen the evidence. 

INSUFFICIENT EVIDENCE 

This category includes agents for which there is less than the minimum sufficient evidence 
necessary for assessing the potential for reproductive toxicity. Included are situations such as when no 
data are available on reproductive toxicity; as well as for data bases from studies on test animals or 
humans that have a limited study design or conduct (e.g., small numbers of test animals or human 
subjects, inappropriate dose selection or exposure information, other uncontrolled factors); data from 
studies that examined only a limited number of endpoints and reported no adverse reproductive effects; 
or data bases that were limited to information on structure-activity relationships, short-term or in vitro 
tests, pharmacokinetic data, or metabolic precursors. 

Because it is more difficult, both biologically and statistically, to support a finding of no apparent 
hazard, more data are generally required to support this conclusion than a finding for a potential hazard. 
For example, to judge whether a hazard for reproductive toxicity could exist for a given agent, the 
minimum evidence could be data from a single appropriate, well-executed study in a single test species 
that demonstrates an adverse reproductive effect, or suggestive evidence from adequately conducted 
clinical or epidemiologic studies. As in all situations, it is important that the results be biologically 
plausible and consistent. On the other hand, to judge whether an agent is unlikely to pose a hazard for 
reproductive toxicity, the minimum evidence would include data on an array of endpoints and from 
studies with more than one species that showed no reproductive effects at doses that were otherwise 
minimally toxic to the adult animal. In addition, there may be human data from appropriate studies that 
are supportive of no apparent hazard. In the event that a substantial database exists for a given 
chemical, but no single study meets current test guidelines, the risk assessor should use scientific 
judgment to determine whether the composite database may be viewed as meeting the "Sufficient" 
criteria. 

Some important considerations in determining the confidence in the health database are as follows: 

Data of equivalent quality from human exposures are given more weight than data from 
exposures of test species. 

Although a single study of high quality could be sufficient to achieve a relatively high level 
of confidence, replication increases the confidence that may be placed in such results. 

Data are available from one or more in vivo studies of acceptable quality with humans or 
other mammalian species that are believed to be predictive of human responses. 

Data exhibit a dose-response relationship. 

Results are statistically significant and biologically plausible. 

When multiple studies are available, the results are consistent. 

Sufficient information is available to reconcile discordant data. 

Route, level, duration, and frequency of exposure are appropriate. 

An adequate array of endpoints has been examined. 

The power and statistical treatment of the studies are appropriate. 

Any statistically significant deviation from baseline levels for an in vivo effect warrants closer 
examination. To determine whether such a deviation constitutes an adverse effect requires an 
understanding of its role within a complex system and the determination of whether a "true effect" has 
been observed. Application of the above criteria, combined with guidance presented in Section 3.2, 
can facilitate such determinations. 

The greatest confidence for identification of a reproductive hazard should be placed on 
significant adverse effects on sexual behavior, fertility or development, or other endpoints that are 
directly related to reproductive function such as menstrual (estrous) cycle normality, sperm evaluations, 
reproductive histopathology, reproductive organ weights, and reproductive endocrinology. Agents 
producing adverse effects on these endpoints can be assigned to the "Sufficient Evidence" category if 
study quality is adequate. 

Less confidence should be placed in results when only measures such as in vitro tests, data from 
nonmammals, or structure-activity relationships are available, but positive results may trigger follow-up 
studies that extend the preliminary data and determine the extent to which function might be affected. 
Results from these types of studies alone, whether or not they demonstrate an effect, may be suggestive, 
but should be assigned to the "Insufficient Evidence" category. 

The absence of effects in test species on the endpoints that are evaluated routinely (i.e., fertility, 
histopathology, prenatal development, and organ weights) may constitute sufficient evidence to place a 
low priority on the potential reproductive toxicity of a chemical. However, in such cases, careful 
consideration should be given to the sensitivity of these endpoints and to the quality of the data on these 
endpoints. Consideration also should be given to the possibility of adverse effects that may not be 
reflected in these routine measures (e.g., germ-cell mutation, alterations in estrous cyclicity or sperm 
measures such as motility or morphology, functional effects from developmental exposures). 

Judging that the health database indicates a potential reproductive hazard does not mean that 
the agent will be a hazard at every exposure level (because of the assumption of a nonlinear dose- 
response) or in every situation (e.g., the type and degree of hazard may vary significantly depending on 
route and timing of exposure). In the final risk characterization, the summary of the hazard 
characterization should always be presented with information on the quantitative dose-response analysis 
and, if available, with the human exposure estimates. 

4. QUANTITATIVE DOSE-RESPONSE ANALYSIS 



In quantitative dose-response assessment, a nonlinear dose-response is assumed for noncancer 
health effects unless mode of action or pharmacodynamic information indicate otherwise. If sufficient 
data are available, a biologically based approach should be used on a chemical-specific basis to predict 
the shape of the dose-response curve below the observable range. It is plausible that certain biologic 
processes (e.g., Sertoli cell barrier selectivity, metabolic and repair capabilities of the germ cells) may 
impede the attainment or maintenance of concentrations of the agent at the target site following 
exposure to low-dose levels that would be associated with adverse effects. The assumption of a 
nonlinear dose-response suggests that the application of adequate uncertainty factors to a NOAEL, 
LOAEL, or benchmark dose will result in an exposure level for all humans that is not attended with 
significant risk above background. With a linear dose-response, it is assumed that some risk exists at 
any level of exposure, with risk decreasing as exposure decreases. 

The NOAEL is the highest dose at which there is no significant increase in the frequency of an 
adverse effect in any manifestation of reproductive toxicity compared with the appropriate control 
group in a database having sufficient evidence for use in a risk assessment. The LOAEL is the lowest 
dose at which there is a significant increase in the frequency of adverse reproductive effects compared 
with the appropriate control group in a database having sufficient evidence. A significant increase may 
be based on statistical significance or on a biologically significant trend. Evidence for biological 
significance may be strengthened by mode of action or other biochemical evidence at lower exposure 
levels that supports the causation of such an effect. The existence of a NOAEL in an experimental 
animal study does not show the shape of the dose-response below the observable range; it only defines 
the highest level of exposure under the conditions of the study that is not associated with a significant 
increase in an adverse effect. Alternatively, mathematical modeling of the dose-response relationship 
may be performed in the experimental range. This approach can be used to determine a benchmark 
dose, which may be used in place of the NOAEL as a point of departure for calculating an RfD, RfC, 
MOE, or other exposure estimates. 
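
The NOAEL and LOAEL definitions above can be illustrated with a minimal sketch. It assumes that each treated dose group has already been compared with the appropriate control and flagged as showing, or not showing, a significant increase in an adverse effect; the function name and inputs are hypothetical and are not part of these Guidelines. The sketch also assumes the common convention that the NOAEL is taken from doses below the LOAEL.

```python
from typing import Optional, Sequence, Tuple


def noael_loael(
    doses: Sequence[float], significant: Sequence[bool]
) -> Tuple[Optional[float], Optional[float]]:
    """Return (NOAEL, LOAEL) for treated dose groups (control excluded).

    significant[i] is True when dose i showed a significant increase in an
    adverse reproductive effect relative to the control group. How that
    judgment is made (statistical test or biological trend) is outside
    this sketch.
    """
    pairs = sorted(zip(doses, significant))
    adverse = [dose for dose, flag in pairs if flag]
    # LOAEL: lowest dose with a significant increase in adverse effects.
    loael = min(adverse) if adverse else None
    # NOAEL: highest dose without a significant increase. Assumption: only
    # doses below the LOAEL are considered.
    candidates = [
        dose for dose, flag in pairs
        if not flag and (loael is None or dose < loael)
    ]
    noael = max(candidates) if candidates else None
    return noael, loael


# Hypothetical study with four treated groups (doses in mg/kg-day).
print(noael_loael([0.1, 1.0, 10.0, 100.0], [False, False, True, True]))
```

When every treated group shows an effect, the sketch returns no NOAEL, which corresponds to the case in which only a LOAEL is available.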

Several limitations in the use of the NOAEL have been described (Kimmel, C.A., and Gaylor, 
1988; U.S. EPA, 1995b): (1) Use of the NOAEL focuses only on the dose that is the NOAEL and 
does not incorporate information on the slope of the dose-response curve or the variability in the data; 
(2) Because data variability is not taken into account (i.e., confidence limits are not used), the NOAEL 
will likely be higher with decreasing sample size or poor study conduct, either of which are usually 
associated with increasing variability in the data; (3) The NOAEL is limited to one of the experimental 
doses; (4) The number and spacing of doses in a study can influence the dose that is chosen for the 
NOAEL; and (5) Because the NOAEL is defined as a dose that does not produce an observable 
change in adverse responses from control levels and is dependent on the power of the study, 
theoretically the risk associated with it may fall anywhere between zero and an incidence just below that 
detectable from control levels (usually in the range of 7% to 10% for quantal data). The 95% upper 
confidence limit on developmental toxicity risk at the NOAEL has been estimated for several data sets 
to be 2% to 6% (Crump, 1984; Gaylor, 1989); similar evaluations have not been conducted on data 
for other reproductive effects. Because of the limitations associated with the use of the NOAEL, the 
Agency is beginning to use the benchmark dose approach for quantitative dose-response evaluation 
when sufficient data are available. 

Calculation and use of the benchmark dose are described in the EPA document The Use of 
the Benchmark Dose Approach in Health Risk Assessment (U.S. EPA, 1995b). The Agency is 
currently developing guidance for application of the benchmark dose, including a decision process to 
use for the various steps in the analysis (U.S. EPA, 1996c). The benchmark dose is based on a model- 
derived estimate of a particular incidence level, such as a 5% or 10% incidence. The BMD/C for a 
given endpoint serves as a consistent point of departure for low-dose extrapolation. In some cases, 
mode of action data may be sufficient to estimate a BMD/C at levels below the observable range for 
the health effect of concern. A benchmark response (BMR) of 5% is usually the lowest level of risk 
that can be estimated adequately for binomial endpoints from standard developmental toxicity studies 
(Allen et al., 1994a,b). For fetal weight, a continuous endpoint, a 5% change from the control mean 
was near the limit of detection for standard prenatal toxicity studies (Kavlock et al., 1995). The 
modeling approaches that have been proposed for developmental toxicity (U.S. EPA, 1995b) are, for 
the most part, curve-fitting models that have biological plausibility, but do not incorporate mode of 
action. Similar approaches can be applied to other reproductive toxicity data to derive dose-response 
curves for data in the observed dose range, but may or may not accurately predict risk at low levels of 
exposure. Further guidance on the use of the BMD/C is being developed by the Agency (U.S. EPA, 
1996c). 
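
As an illustration of a model-derived dose for a chosen incidence level, the following minimal sketch uses a one-hit model for extra risk, 1 - exp(-slope * dose). The one-hit form and the slope value are assumptions made only for this example; they are not the models described in U.S. EPA (1995b), and the sketch returns a central estimate with no confidence limits.

```python
import math


def extra_risk(dose: float, slope: float) -> float:
    """Extra risk over background under an assumed one-hit model."""
    return 1.0 - math.exp(-slope * dose)


def benchmark_dose(bmr: float, slope: float) -> float:
    """Dose at which the assumed model predicts an extra risk equal to bmr.

    Obtained by inverting extra_risk: dose = -ln(1 - bmr) / slope.
    """
    if not 0.0 < bmr < 1.0:
        raise ValueError("bmr must lie strictly between 0 and 1")
    if slope <= 0.0:
        raise ValueError("slope must be positive")
    return -math.log(1.0 - bmr) / slope


# Hypothetical fitted slope of 0.02 per mg/kg-day.
for bmr in (0.05, 0.10):
    print(bmr, round(benchmark_dose(bmr, slope=0.02), 3))
```

In practice the model parameters would come from fitting the observed dose-response data, and the fit may or may not predict risk accurately below the observed dose range.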

The RfD or RfC for reproductive toxicity is an estimate of a daily exposure to the human 
population that is assumed to be without appreciable risk of deleterious reproductive effects over a 
lifetime of exposure. The RfD or RfC is derived by applying uncertainty factors to the NOAEL, or the 
LOAEL if a NOAEL is not available, or to the benchmark dose. Because of the short duration of most 
studies of developmental toxicity, a unique value (RfD_DT or RfC_DT) is determined for adverse 
developmental effects. For adverse reproductive effects on endpoints other than those of 
developmental toxicity, no special designator is attached. Data on reproductive toxicity (including 
developmental toxicity) are considered along with other data on a particular chemical in deriving an RfD 
or RfC. 

The effect used for determining the NOAEL, LOAEL, or benchmark dose in deriving the RfD 
or RfC is the most sensitive adverse reproductive endpoint (i.e., the critical effect) from the most 
appropriate or, in the absence of such information, the most sensitive mammalian species (see Sections 
2 and 3.2.1). Uncertainty factors for reproductive and other forms of systemic toxicity applied to the 
NOAEL or benchmark dose generally include factors of 3 or 10 each for interspecies variation and for 
intraspecies variation. Additional factors may be applied to account for other uncertainties that may 
exist in the database. In circumstances where only a LOAEL is available, the use of an additional 
uncertainty factor of up to 10 may be required, depending on the sensitivity of the endpoints evaluated, 
adequacy of dose levels tested, or general confidence in the LOAEL. 

Other areas of uncertainty may be identified and modifying factors used depending on the 
characterization of the database (e.g., if the only data available are from a one-generation reproductive 
effects study; see Section 3.7), data on pharmacokinetics, or other considerations that may alter the 
level of confidence in the data (U.S. EPA, 1987). The total size of the uncertainty factor will vary from 
agent to agent and requires scientific judgment, taking into account interspecies differences, variability 
within species, the slope of the dose-response curve, the types of reproductive effects observed, the 
background incidence of the effects, the route of administration, and pharmacokinetic data. 

The NOAEL, LOAEL, or the benchmark dose is divided by the total uncertainty factor 
selected for the critical effect in the most appropriate or most sensitive mammalian species to determine 
the RfD or RfC. If the NOAEL, LOAEL, or benchmark dose for other forms of systemic toxicity is 
lower than that for reproductive toxicity, this should be noted in the risk characterization, and this value 
should be compared with data from other studies in which adult animals are exposed. Thus, 
reproductive toxicity data should be discussed in the context of other toxicity data. 
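
The division described above can be sketched as follows. The function and parameter names are hypothetical, the default factors of 10 are only one of the choices mentioned in the text (3 or 10), and combining the individual factors by multiplication into a total uncertainty factor is assumed here as the conventional practice. Selecting the factors remains a matter of scientific judgment.

```python
def reference_dose(
    point_of_departure: float,
    interspecies: float = 10.0,
    intraspecies: float = 10.0,
    loael_to_noael: float = 1.0,
    database: float = 1.0,
) -> float:
    """Divide a NOAEL, LOAEL, or benchmark dose by the total uncertainty factor.

    point_of_departure: dose for the critical effect (e.g., mg/kg-day).
    interspecies, intraspecies: factors of 3 or 10 each, per the text.
    loael_to_noael: additional factor of up to 10 when only a LOAEL exists.
    database: additional factor for other database uncertainties.
    """
    factors = (interspecies, intraspecies, loael_to_noael, database)
    if point_of_departure <= 0 or any(f < 1 for f in factors):
        raise ValueError("dose must be positive and each factor at least 1")
    total_uncertainty_factor = 1.0
    for factor in factors:
        total_uncertainty_factor *= factor
    return point_of_departure / total_uncertainty_factor


# Hypothetical NOAEL of 5 mg/kg-day with two default factors of 10.
print(reference_dose(5.0))
# Same dose treated as a LOAEL, with an additional factor of 10.
print(reference_dose(5.0, loael_to_noael=10.0))
```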

It has generally been assumed that there is a nonlinear dose-response for reproductive toxicity. 
This is based on known homeostatic, compensatory, or adaptive mechanisms that must be overcome 
before a toxic endpoint is manifested and on the rationale that cells and organs of the reproductive 
system and the developing organism are known to have some capacity for repair of damage. However, 
in a population, background levels of toxic agents and preexisting conditions may increase the sensitivity 
of some individuals in the population. Thus, exposure to a toxic agent may result in an increased risk of 
adverse effects for some, but not necessarily all, individuals within the population. 

Efforts are underway to develop models that are more biologically based. These models should 
provide a more accurate estimation of low-dose risk to humans. The development of biologically 
based dose-response models in reproductive toxicology has been impeded by a number of factors, 
including limited understanding of the biologic mechanisms underlying reproductive toxicity, intra- and 
interspecies differences in the types of reproductive events, lack of appropriate pharmacokinetic data, 
and inadequate information on the influence of other types of systemic toxicity on the dose-response 
curve. Current research on modes of action in reproductive toxicology is promising and may provide 
data that are useful for appropriate modeling in the near future. 

4.1. UTILIZATION OF INFORMATION IN RISK CHARACTERIZATION 

The hazard characterization and quantitative dose-response evaluations are incorporated into 
the final characterization of risk along with information on estimates of human exposure. The analysis 
depends on and should describe scientific judgments as to the accuracy and sufficiency of the health- 
related data in experimental animals and humans (if available), the biologic relevance of significant 
effects, and other considerations important in the interpretation and application of data to humans. 
Scientific judgment is always necessary, and in many cases, interaction with scientists in specific 
disciplines (e.g., reproductive toxicology, epidemiology, statistics) is recommended. 

5. EXPOSURE ASSESSMENT 



To obtain a quantitative estimate of risk for the human population, an estimate of human 
exposure is required. The Guidelines for Exposure Assessment (U.S. EPA, 1992) have been 
published separately and will not be discussed in detail here. Rather, issues important to reproductive 
toxicity risk assessment are addressed. In general, the exposure assessment describes the magnitude, 
duration, schedule, and route of human exposure. Ideally, existing body burden as well as internal 
circulating and target organ exposure information for the agent of concern and other synergistic or 
antagonistic agents should be described. It should include information on the purpose, scope, level of 
detail and approach used, including estimates of exposure and dose by pathway and route for 
populations, subpopulations, and individuals in a manner that is appropriate for the intended risk 
characterization. It also should provide an evaluation of the overall level of confidence in the estimate(s) 
of exposure and dose and the conclusions drawn. This information is usually developed from 
monitoring data, from estimates based on modeling of environmental exposures, and from application of 
paradigms to exposure data bases. Often quantitative estimates of exposures may not be available 
(e.g., workplace or environmental measurements). In such instances, employment or residential 
histories also may be used in characterizing exposure in a qualitative sense. The potential use of 
biomarkers as indicators of exposure is an area of active interest. 

Studies of occupational populations may provide valuable information on the potential 
environmental health risks for certain agents. Exposures among environmentally exposed human 
populations tend to be lower (but of longer duration) than those in studies of occupationally exposed 
populations and therefore may require more observations to assure sufficient statistical power. Also, 
reconstruction of exposures is more difficult in an environmental study than in those done in workplace 
settings where industrial hygiene monitoring may provide more detailed exposure data. 

The nature of the exposure may be defined at a particular point in time or may reflect 
cumulative exposure. Each approach makes an assumption about the underlying relationship between 
exposure and outcome. For example, a cumulative exposure measure assumes that total exposure is 
important, with a greater probability of effect with greater total exposure or body burden. A 
dichotomous exposure measure (ever exposed versus never exposed) assumes an irreversible effect of 
exposure. Models that define exposure only at a specific time may assume that only the present 
exposure is important (Selevan and Lemasters, 1987). The appropriate exposure model depends on 
the biologic processes affected and the nature of the chemical under study. Thus, a cumulative or 
dichotomous exposure model may be appropriate if injury occurs in cells that cannot be replaced or 
repaired (e.g., oocytes); on the other hand, a concurrent exposure model may be appropriate for cells 
that are being generated continually (e.g., spermatids). 
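
The three exposure measures discussed above can be contrasted with a minimal sketch. It assumes a hypothetical exposure history recorded as one level per equal-length time interval; the function names and the data are illustrative only.

```python
from typing import Sequence


def cumulative_exposure(levels: Sequence[float]) -> float:
    """Total exposure over the history (sum over equal-length intervals)."""
    return sum(levels)


def ever_exposed(levels: Sequence[float]) -> bool:
    """Dichotomous measure: True if any interval had exposure above zero."""
    return any(level > 0 for level in levels)


def concurrent_exposure(levels: Sequence[float], interval: int) -> float:
    """Exposure level in the single interval of interest."""
    return levels[interval]


# Hypothetical history: exposed only in the second and third intervals.
history = [0.0, 2.0, 3.0, 0.0]
print(cumulative_exposure(history))     # total over the whole history
print(ever_exposed(history))            # ever versus never exposed
print(concurrent_exposure(history, 3))  # level at the time of observation
```

The same history yields a nonzero cumulative measure and a positive dichotomous measure, but a concurrent measure of zero at the last interval, which shows why the choice of model should follow the biologic process affected.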

There are a number of unique considerations regarding the exposure assessment for 
reproductive toxicity. Exposure at different stages of male and female development can result in 
different outcomes. Such age-dependent variation has been well documented in both experimental 
animal and human studies. Prenatal and neonatal treatment can irreversibly alter reproductive function 
and other aspects of development in a manner or to an extent that may not be predicted from adult-only 
exposure. Moreover, chemicals that alter sexual differentiation in rodents during these periods may 
have similar effects in humans, because the mechanisms underlying these developmental processes 
appear to be similar in all mammalian species (Gray, 1991). 

The susceptibility of elderly males and females to chemical insult has not been well studied. 
Although procreative competence may not be a major health concern with elderly individuals, other 
biologic functions maintained by the gonads (e.g., hormone production) are of significance (Walker, 
1986). An exposure assessment should characterize the likelihood of exposure of these different 
subgroups (embryo or fetus, neonate, juvenile, young adult, older adult) and the risk assessment should 
factor in the susceptibility of different age groups to the extent possible. 

The relationship between time or duration of exposure and observation of male reproductive 
effects has particular significance for short-term exposures. Spermatogenesis is a temporally 
synchronized process. In humans, germ cells that were spermatozoa, spermatids, spermatocytes, or 
spermatogonia at the time of an acute exposure require 1 to 2, 3 to 5, 5 to 8, or 8 to 12 weeks, 
respectively, to appear in an ejaculate. That timing may vary somewhat depending on degree of sexual 
activity. It is possible that an endpoint may be examined too early or too late to detect an effect if only 
a particular cell type was affected during a relatively brief exposure to an agent. The absence of an 
effect when observations were made too late suggests either a reversible effect or no effect. However, 
an effect that is reversible at lower exposures might become irreversible with higher or longer exposures 
or exposure of a more susceptible individual. Thus, the failure to detect transient effects because of 
improper timing of observations may be important. If information is available on the type of effect 
expected from a class of agents, it may be possible to evaluate whether the timing of endpoint 
measurement relative to the timing of the short-term exposure is appropriate. Some information on the 
appropriateness of the protocol can be obtained if test animal data are available to identify the most 
sensitive cell type or the putative mechanism of action for a given agent. 
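
The timing relationship described above can be expressed as a small lookup. The week ranges are those given in the text for humans; the function name is hypothetical, and the ranges are treated as inclusive bounds, which is an assumption of this sketch (the text notes that timing may vary somewhat).

```python
from typing import Dict, List, Tuple

# Weeks after an acute exposure at which cells of each type (at the time
# of exposure) are expected to appear in a human ejaculate, per the text.
WINDOWS_WEEKS: Dict[str, Tuple[float, float]] = {
    "spermatozoa": (1, 2),
    "spermatids": (3, 5),
    "spermatocytes": (5, 8),
    "spermatogonia": (8, 12),
}


def stages_in_ejaculate(weeks_after_exposure: float) -> List[str]:
    """Cell types (at time of exposure) expected in a sample taken now."""
    return [
        stage
        for stage, (start, end) in WINDOWS_WEEKS.items()
        if start <= weeks_after_exposure <= end
    ]


# A sample taken 4 weeks after exposure reflects cells that were spermatids.
print(stages_in_ejaculate(4))
# A sample taken 13 weeks after exposure falls outside every window.
print(stages_in_ejaculate(13))
```

A sample taken outside the window for the affected cell type would not be expected to reveal the effect, which is the timing problem the paragraph describes.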

Compared with acute exposures, the link between exposure and outcome may be more 
apparent with relatively constant subchronic or longer exposures that are of sufficient duration to cover 
all phases of spermatogenesis (Russell et al., 1990). Assessments may be made at any time after this 
point as long as exposure remains constant. Time required for the agent or metabolite to attain steady- 
state levels should also be considered. Again, application of models of exposure (e.g., dichotomous, 
concurrent, or cumulative) depends on the suspected target and chemical mechanism of action. 

The reversibility of an adverse effect on the reproductive system can be affected by the degree 
and duration of exposure (Clegg, 1995). The degree of stem cell loss is inversely related to the degree 
of restoration of sperm production, because repopulation of the germinal epithelium is dependent on the 
stem cells (Meistrich, 1982; Foote and Berndtson, 1992). For agents that bioaccumulate, increasing 
duration of exposure may also increase the extent of damage to the stem cell population. Damage to 
other spermatogenic cell types reduces the number of sperm produced, but recovery should occur 
when the toxic agent is removed. Less is known about the effects of toxicity on the Sertoli cells. 
Temporary impairment of Sertoli cell function may produce long-lasting effects on spermatogenesis. 
Destruction of Sertoli cells or interference with their proliferation before puberty are irreversible effects 
because replication ceases after puberty. Sertoli cells are essential for support of the spermatogenic 
process and loss of those cells results in a permanent reduction of spermatogenic capability (Foster, 
1992). 

When recovery is possible, the duration of the recovery period is determined by the time for 
regeneration (for stem cells) and repopulation of the affected spermatogenic cell types and appearance 
of those cells as sperm in the ejaculate. The time required for these events to occur varies with the 
species, the pharmacokinetic properties of the agent, the extent to which the stem cell population has 
been destroyed, and the degree of sublethal toxicity inflicted on the stem cells or Sertoli cells. When 
the stem cell population has been partially destroyed, humans require more time than mice to reach the 
same degree of recovery (Meistrich and Samuels, 1985). 

Unique considerations in the assessment of female reproductive toxicity include the duration and 
period of exposure as related to the development or stage of reproductive life (e.g., prenatal, 
prepubescent, reproductive, or postmenopausal) or considerations of different physiologic states (e.g., 
nonpregnant, pregnant, lactating). For infertility, a cumulative exposure measure assumes destruction of 
increasing numbers of primary oocytes with greater lifetime exposure or increasing body burden. 
However, humans may be exposed to varying levels of an agent within the study period. Exposures 
during certain critical points in the reproductive process may affect the outcomes observed in humans 
(Lemasters and Selevan, 1984). In test species, perinatal exposure to androgens or estrogens such as 
zearalenone, methoxychlor, and DDT (Bulger and Kupfer, 1985; Gray et al., 1985) has been shown 
to advance puberty and masculinize females. Similar effects have been reported in humans (both sexes) 
exposed neonatally to synthetic estrogens or progestins (Steinberger and Lloyd, 1985; Schardein, 
1993). Studies using test species also have shown that exposure to some environmental agents such as 
ionizing radiation (Dobson and Felton, 1983) and glycol ethers (Heindel et al., 1989) can deplete the 
pool of primordial follicles and thus significantly shorten the female's reproductive lifespan. 
Furthermore, exposure to compounds at different stages of the ovarian cycle can disrupt or delay 
follicular recruitment and development (Armstrong, 1986), ovulation (Everett and Sawyer, 1950; 
Terranova, 1980), and ovum transport (Cummings and Perreault, 1990). Compounds that delay 
ovulation can lead to significant alterations in egg viability (Peluso et al., 1979), fertilizability of the egg 
(Fugo and Butcher, 1966; Butcher and Fugo, 1967; Butcher et al., 1975), and a reduction in litter size 
(Fugo and Butcher, 1966). After ovulation, single exposures to microtubule poisons such as 
carbendazim may impair the completion of meiosis in the fertilized oocyte with adverse developmental 
consequences (Perreault et al., 1992; Zuelke and Perreault, 1995). Thus, knowledge of when acute 
exposures occur relative to the female's lifespan and reproductive cycle can provide insight into how an 
agent disrupts reproductive function. 

DES is a classic example of an agent causing different effects on the reproductive system in the 
developing organism compared with those in adults (McLachlan, 1980). DES, as well as other agents 
with estrogenic or anti-androgenic activity, interferes with the development of the Mullerian and 
Wolffian duct systems and thereby causes irreversible structural and functional damage to the 
developing reproductive system. In adults, the reproductive effects that are caused by the estrogenic 
activity of DES do not necessarily result in permanent damage. 

Unique considerations for developmental effects are duration and period of exposure as related 
to stage of development (i.e., critical periods) and the possibility that even a single exposure may be 
sufficient to produce adverse developmental effects. Repeated exposure is not a necessary prerequisite 
for developmental toxicity to be manifested, although it should be considered in cases where there is 
evidence of cumulative exposure or where the half-life of the agent is long enough to produce an 
increasing body burden over time. For these reasons, it is assumed that, in most cases, a single 
exposure at the critical time in development is sufficient to produce an adverse developmental effect. 
Therefore, the human exposure estimates used to calculate the MOE for an adverse developmental 
effect or to compare to the RfD or RfC are usually based on a single daily dose that is not adjusted for 
duration or pattern (e.g., continuous or intermittent) of exposure. For example, it would be 
inappropriate to use time-weighted averages or adjustment of exposure over a different time frame than 
that actually encountered (such as the adjustment of a 6-hour inhalation exposure to account for a 24- 
hour exposure scenario) unless pharmacokinetic data were available to indicate an accumulation with 
continuous exposure. In the case of intermittent exposures, examination of the peak exposures as well 
as the average exposure over the time of exposure would be important. 
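
The distinction drawn above between peak exposure, average exposure, and a duration adjustment can be illustrated with a small sketch. The episode values, units, and function names below are hypothetical and are not taken from these Guidelines; the 24-hour adjustment is shown only for contrast, since the text treats it as inappropriate for developmental effects unless pharmacokinetic data indicate accumulation.

```python
def summarize_intermittent(episodes):
    """Return (peak, time-weighted average) for (concentration, hours) episodes."""
    total_hours = sum(hours for _, hours in episodes)
    peak = max(conc for conc, _ in episodes)
    # Average over the time actually exposed, not over a longer assumed period.
    average = sum(conc * hours for conc, hours in episodes) / total_hours
    return peak, average


def adjust_to_24_hours(concentration, hours_exposed):
    """Spread an exposure over 24 hours (the adjustment the text cautions against)."""
    return concentration * hours_exposed / 24.0


# Hypothetical inhalation episodes: (concentration in ppm, duration in hours).
episodes = [(40.0, 2.0), (10.0, 4.0)]
peak, average = summarize_intermittent(episodes)

# A 6-hour exposure at the average concentration, diluted over 24 hours.
adjusted = adjust_to_24_hours(average, 6.0)
```

In this hypothetical case the adjusted value is one quarter of the average concentration actually encountered, which is why the choice of exposure metric should be stated explicitly.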

It should be recognized that, based on the definitions used in these Guidelines, almost any 
segment of the human population may be at risk for a reproductive effect. Although the reproductive 
effects of exposures may be manifested while the exposure is occurring (e.g., menstrual disorder, 
decreased sperm count, spontaneous abortion) some effects may not be detectable until later in life 
(e.g., endocrine disruption of reproductive tract development, premature reproductive senescence due 
to oocyte depletion), long after exposure has ceased. 

6. RISK CHARACTERIZATION 



6.1. OVERVIEW 

A risk characterization is an essential part of any Agency report on risk whether the report is a 
preliminary one prepared to support allocation of resources toward further study, a site-specific 
assessment, or a comprehensive one prepared to support regulatory decisions. A risk characterization 
should be prepared in a manner that is clear, reasonable, and consistent with other risk 
characterizations of similar scope prepared across programs in the Agency. It should identify and 
discuss all the major issues associated with determining the nature and extent of the risk and provide 
commentary on any constraints limiting more complete exposition. The key aspects of risk 
characterization are: (1) bridging risk assessment and risk management, (2) discussing confidence and 
uncertainties, and (3) presenting several types of risk information. In this final step of a risk assessment, 
the risk characterization involves integration of toxicity information from the hazard characterization and 
quantitative dose-response analysis with the human exposure estimates and provides an evaluation of 
the overall quality of the assessment, describes risk in terms of the nature and extent of harm, and 
communicates results of the risk assessment to a risk manager. A risk manager can then use the risk 
assessment, along with other risk management elements, to make public health decisions. The 
information should also assist others outside the Agency in understanding the scientific basis for 
regulatory decisions. 

Risk characterization is intended to summarize key aspects of the following components of the 
risk assessment: 

The nature, reliability, and consistency of the data used. 

The reasons for selection of the key study(ies) and the critical effect(s) and their relevance 
to human outcomes. 

The qualitative and quantitative descriptors of the results of the risk assessment. 

The limitations of the available data, the assumptions used to bridge knowledge gaps in 
working with those data, and implications of using alternative assumptions. 

The strengths and weaknesses of the risk assessment and the level of scientific confidence 
in the assessment. 

The areas of uncertainty, additional data/research needs to improve confidence in the risk 
assessment, and the potential impacts of the new research. 

The risk characterization should be limited to the most significant and relevant data, conclusions, 
and uncertainties. When special circumstances exist that preclude full assessment, those circumstances 
should be explained and the related limitations identified. 

The following sections describe these aspects of the risk characterization in more detail, but do 
not attempt to provide a full discussion of risk characterization. Rather, these Guidelines point out 
issues that are important to risk characterization for reproductive toxicity. Comprehensive general 
guidance for risk characterization is provided by Habicht (1992) and Browner (1995). 

6.2. INTEGRATION OF HAZARD CHARACTERIZATION, QUANTITATIVE DOSE- 
RESPONSE, AND EXPOSURE ASSESSMENTS 

In developing each component of the risk assessment, risk assessors must make judgments 
concerning human relevance of the toxicity data, including the appropriateness of the various test animal 
models for which data are available, and the route, timing, and duration of exposure relative to the 
expected human exposure. These judgments should be summarized at each stage of the risk 
assessment process. When data are not available to make such judgments, as is often the case, the 
background information and assumptions discussed in the Overview (Section 1) provide default 
positions. The default positions used and the rationale behind the use of each default position should be 
clearly stated. In integrating the parts of the assessment, risk assessors must determine if some of these 
judgments have implications for other portions of the assessment, and whether the various components 
of the assessment are compatible. 

The description of the relevant data should convey the major strengths and weaknesses of the 
assessment that arise from availability and quality of data and the current limits of understanding of the 
mechanisms of toxicity. Confidence in the results of a risk assessment is a function of confidence in the 
results of these analyses. Each section (hazard characterization, quantitative dose-response analysis, 
and exposure assessment) should have its own summary, and these summaries should be integrated into 
the overall risk characterization. Interpretation of data should be explained, and risk managers should 
be given a clear picture of consensus or lack of consensus that exists about significant aspects of the 
assessment. When more than one interpretation is supported by the data, the alternative plausible 
approaches should be presented along with the strengths, weaknesses, and impacts of those options. If 
one interpretation or option has been selected over another, the rationale should be given; if not, then 
both should be presented as plausible alternatives. 

The risk characterization should not only examine the judgments, but also should explain the 
constraints of available data and the state of knowledge about the phenomena studied in making them, 
including: 

The qualitative conclusions about the likelihood that the chemical may pose a specific 
hazard to human health, the nature of the observed effects, under what conditions (route, 
dose levels, time, and duration) of exposure these effects occur, and whether the 
health-related data are sufficient and relevant to use in a risk assessment. 

A discussion of the dose-response patterns for the critical effect(s) and their relationships 
to the occurrence of other toxicity data, such as the shapes and slopes of the 
dose-response curves for the various other endpoints; the rationale behind the determination of 
the NOAEL, LOAEL, and/or benchmark dose; and the assumptions underlying the 
estimation of the RfD, RfC, or other exposure estimate. 

Descriptions of the estimates of the range of human exposure (e.g., central tendency, high 
end), the route, duration, and pattern of the exposure, relevant pharmacokinetics, and the 
size and characteristics of the various populations that might be exposed. 

The risk characterization of an agent being assessed for reproductive toxicity should be 
based on data from the most appropriate species or, if such information is not available, on 
the most sensitive species tested. It also should be based on the most sensitive indicator of 
an adverse reproductive effect, whether in the male, the female (nonpregnant or pregnant), 
or the developing organism, and should be considered in relation to other forms of toxicity. 
The relevance of this indicator to human reproductive outcomes should be described. The 
rationale for those decisions should be presented. 

If data to be used in a risk characterization are from a route of exposure other than the 
expected human exposure, then pharmacokinetic data should be used, if available, to extrapolate 
across routes of exposure. If such data are not available, the Agency makes certain assumptions 
concerning the amount of absorption likely or the applicability of the data from one route to another 
(U.S. EPA, 1985a, 1986b). Discussion of some of these issues may be found in the Proceedings of 
the Workshop on Acceptability and Interpretation of Dermal Developmental Toxicity Studies 
(Kimmel, C.A., and Francis, 1990) and Principles of Route-to-Route Extrapolation for Risk 
Assessment (Gerrity et al., 1990). The risk characterization should identify the methods used to 
extrapolate across exposure routes and discuss the strengths and limitations of the approach. 

The level of confidence in the hazard characterization and quantitative dose-response evaluation 
should be stated to the extent possible, including placement of the agent into the appropriate category 
regarding the sufficiency of the health-related data (see Section 3.7). A comprehensive risk assessment 
ideally includes information on a variety of endpoints that provide insight into the full spectrum of 
potential reproductive responses. A profile that integrates both human and test species data and 
incorporates both sensitive endpoints (e.g., properly performed and fully evaluated histopathology) and 
functional correlates (e.g., fertility) allows more confidence in a risk assessment for a given agent. 

Descriptions of the nature of potential human exposures are important for prediction of specific 
outcomes and the likelihood of persistence or reversibility of the effect in different exposure situations 
with different subpopulations (U.S. EPA, 1992; Clegg, 1995). 

In the risk assessment process, risk is estimated as a function of exposure, with the risk of 
adverse effects increasing as exposure increases. Information on the levels of exposure experienced by 
different members of the population is key to understanding the range of risks that may occur. Where 
possible, several descriptors of exposure such as the nature and range of populations and their various 
exposure conditions, central tendencies, and high-end exposure estimates should be presented. 
Differences among individuals in absorption rates, metabolism, or other factors mean that individuals or 
subpopulations with the same level and pattern of exposure may have differing susceptibility. For 
example, the consequences of exposure can differ markedly between developing individuals, young 
adults and aged adults, including whether the effects are permanent or transient. Other considerations 
relative to human exposures might include pregnancy or lactation, potential for exposures to other 
agents, concurrent disease, nutritional status, lifestyle, ethnic background and genetic polymorphism, 
and the possible consequences. Knowledge of the molecular events leading to induction of adverse 
effects may be of use in determining the range of susceptibility in sensitive populations. 

An outline to serve as a guide and formatting aid for developing reproductive risk 
characterizations for chemical-specific risk assessments can be found in Table 7. A common format 
will assist risk managers in evaluating and using reproductive risk characterization. The outline has two 
parts. The first part tracks the reproductive risk assessment to bring forward its major conclusions. 
The second part pulls the information together to characterize the reproductive risk. 

6.3. DESCRIPTORS OF REPRODUCTIVE RISK 

Descriptors of reproductive risk convey information and answer questions about risk, with each 
descriptor providing different information and insights. There are a number of ways to describe risk. 
Details on how to use these descriptors can be obtained from the guidance on risk characterization 
(Browner, 1995) from which some of the information below has been extracted. 

In most cases, the state of the science is not yet adequate to define distributions of factors such 
as population susceptibility. The guidance principles below discuss a variety of risk descriptors that 
primarily reflect differences in estimated exposure. If a full description of the range of susceptibility in 
the population cannot be presented, an effort should be made to identify subgroups that, for various 
reasons, may be particularly susceptible. 

Table 7. Guide for developing chemical-specific risk characterizations for reproductive 
effects 



PART ONE 

Summarizing Major Conclusions in Risk Characterization 

I. Hazard Characterization 

A. What is (are) the key toxicological study (or studies) that provides the basis for health 
concerns for reproductive effects? 

How good is the key study? 

Are the data from laboratory or field studies? In a single or multiple species? 
What adverse reproductive endpoints were observed, and what is the basis for the 
critical effect? 

Describe other studies that support this finding. 
Discuss any valid studies which conflict with this finding. 

B. Besides the reproductive effect observed in the key study, are there other health endpoints 
of concern? What are the significant data gaps? 

C. Discuss available epidemiological or clinical data. For epidemiological studies: 

What types of data were used (e.g., human ecologic, case-control or cohort studies, 
or case reports or series)? 

Describe the degree to which exposures were described. 
Describe the degree to which confounding factors were considered. 
Describe the degree to which other causal factors were excluded. 

D. How much is known about how (through what biological mechanism) the chemical 
produces adverse reproductive effects? 

Discuss relevant studies of mechanisms of action or metabolism. 
Does this information aid in the interpretation of the toxicity data? 
What are the implications for potential adverse reproductive effects? 

E. Comment on any nonpositive data in animals or people, and whether these data were 
considered in the hazard characterization. 

F. If adverse health effects have been observed in wildlife species, characterize such effects 
by discussing the relevant issues as in A through E above. 

G. Summarize the hazard characterization and discuss the significance of each of the 
following: 

Confidence in conclusions 

Alternative conclusions that are also supported by the data 
Significant data gaps 
Highlights of major assumptions 

II. Characterization of Dose-Response 

A. What data were used to develop the dose-response curve? Would the result have been 
significantly different if based on a different data set? 

If laboratory animal data were used: 

Which species were used? 

Most sensitive, average of all species, or other? 

Were any studies excluded? Why? 

If epidemiological data were used: 

Which studies were used? 

Only positive studies, all studies, or some other combination? 
Were any studies excluded? Why? 

Was a meta-analysis performed to combine the epidemiological studies? What 
approach was used? 

Were studies excluded? Why? 

B. Was a model used to develop the dose-response curve and, if so, which one? What 
rationale supports this choice? Is chemical-specific information available to support this 
approach? 

How was the RfD/RfC (or the acceptable range) calculated? 
What assumptions and uncertainty factors were used? 
What is the confidence in the estimates? 

C. Discuss the route, level, and duration of exposure observed, as compared to expected 
human exposures. 

Are the available data from the same route of exposure as the expected human 
exposures? If not, are pharmacokinetic data available to extrapolate across route of 
exposure? 

How far does one need to extrapolate from the observed data to environmental 
exposures? One to two orders of magnitude? Multiple orders of magnitude? What 
is the impact of such an extrapolation? 

D. If adverse health effects have been observed in wildlife species, characterize 
dose-response information using the process outlined in A through C above. 

III. Characterization of Exposure 

A. What are the most significant sources of environmental exposure? 
Are there data on sources of exposure from different media? 
What is the relative contribution of different sources of exposure? 
What are the most significant environmental pathways for exposure? 

B. Describe the populations that were assessed, including the general population, highly 
exposed groups, and highly susceptible groups. 

C. Describe the basis for the exposure assessment, including any monitoring, modeling, or 
other analyses of exposure distributions such as Monte Carlo or kriging. 

D. What are the key descriptors of exposure? 

Describe the (range of) exposures to: "average" individuals, "high-end" individuals, 
general population, high exposure group(s), children, susceptible populations, males, 
females (nonpregnant, pregnant, lactating). 

How was the central tendency estimate developed? 

What factors and/or methods were used in developing this estimate? 

How was the high-end estimate developed? 

Is there information on highly exposed subgroups? 

Who are they? 

What are their levels of exposure? 

How are they accounted for in the assessment? 

E. Is there reason to be concerned about cumulative or multiple exposures because of 
biological, ethnic, racial, or socioeconomic reasons? 

F. If adverse reproductive effects have been observed in wildlife species, characterize wildlife 
exposure by discussing the relevant issues as in A through E above. 

G. Summarize exposure conclusions and discuss the following: 

Results of different approaches, i.e., modeling, monitoring, probability distributions; 
Limitations of each, and the range of most reasonable values; 
Confidence in the results obtained, and the limitations to the results 

PART TWO 

Risk Conclusions and Comparisons 

IV. Risk Conclusions 

A. What is the overall picture of risk, based on the hazard, quantitative dose-response, and 
exposure characterizations? 

B. What are the major conclusions and strengths of the assessment in each of the three main 
analyses (i.e., hazard characterization, quantitative dose-response, and exposure 
assessment)? 

C. What are the major limitations and uncertainties in the three main analyses? 

D. What are the science policy options in each of the three major analyses? 
What are the alternative approaches evaluated? 

What are the reasons for the choices made? 

V. Risk Context 

A. What are the qualitative characteristics of the reproductive hazard (e.g., voluntary vs. 
involuntary, technological vs. natural, etc.)? Comment on findings, if any, from studies of 
risk perception that relate to this hazard or similar hazards. 

B. What are the alternatives to this reproductive hazard? How do the risks compare? 

C. How does this reproductive risk compare to other risks? 

How does this risk compare to other risks in this regulatory program, or other similar risks 
that the EPA has made decisions about? 

Where appropriate, can this risk be compared with past Agency decisions, decisions by 
other federal or state agencies, or common risks with which people may be familiar? 
Describe the limitations of making these comparisons. 

D. Comment on significant community concerns which influence public perception of risk. 

VI. Existing Risk Information 

Comment on other reproductive risk assessments that have been done on this chemical by 
EPA, other federal agencies, or other organizations. Are there significantly different conclusions 
that merit discussion? 

VII. Other Information 

Is there other information that would be useful to the risk manager or the public in this situation 
that has not been described above? 

6.3.1. Distribution of Individual Exposures 

Risk managers are interested generally in answers to questions such as: (1) Who are the 
people at the highest risk and why? (2) What is the average risk or distribution of risks for individuals 
in the population of interest? and (3) What are they doing, where do they live, etc., that might be putting 
them at this higher risk? 

Exposure and reproductive risk descriptors for individuals are intended to provide answers to 
these questions. To describe the range of risks, both high-end and central tendency descriptors are 
used to convey the distribution in risk levels experienced by different individuals in the population. For 
the Agency's purposes, high-end risk descriptors are plausible estimates of the individual risk for those 
persons at the upper end of the risk distribution. Given limitations in current understanding of variability 
in individuals' sensitivity to agents that cause reproductive toxicity, high-end descriptors will usually 
address high-end exposure or dose. Conceptually, high-end exposure means exposure above 
approximately the 90th percentile of the population distribution, but not higher than the individual in the 
population who has the highest exposure. Central tendency descriptors generally reflect central 
estimates of exposure or dose. The descriptor addressing central tendency may be based on either the 
arithmetic mean exposure (average estimate) or the median exposure (median estimate), either of which 
should be clearly labeled. The selection of which descriptor(s) to present in the risk characterization 
will depend on the available data and the goals of the assessment. 
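
As a rough illustration of the central tendency and high-end descriptors discussed above, the following sketch computes the arithmetic mean, the median, and the 90th percentile for a set of individual exposure estimates. The dose values, the units, and the function name are hypothetical, and the nearest-rank percentile rule is only one of several reasonable conventions, not a method prescribed by these Guidelines.

```python
import math
import statistics


def exposure_descriptors(exposures):
    """Return central tendency and high-end descriptors for individual exposures."""
    ordered = sorted(exposures)
    # Nearest-rank 90th percentile: the smallest value with at least
    # 90 percent of the observations at or below it.
    rank = math.ceil(len(ordered) * 9 / 10)
    p90 = ordered[rank - 1]
    return {
        "mean": statistics.mean(ordered),      # average estimate
        "median": statistics.median(ordered),  # median estimate
        "p90": p90,
        "max": ordered[-1],
        # High-end individuals: above the 90th percentile, up to the maximum.
        "high_end": [x for x in ordered if x > p90],
    }


# Hypothetical daily doses (mg/kg-day) for ten individuals.
doses = [0.1, 0.2, 0.2, 0.3, 0.3, 0.4, 0.5, 0.6, 0.9, 2.0]
summary = exposure_descriptors(doses)
```

Because the hypothetical distribution is skewed, the mean exceeds the median, which is one reason the descriptor in use should be clearly labeled.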

6.3.2. Population Exposure 

Population risk refers to assessment of the extent of harm for the population as a whole. In 
theory, it can be calculated by summing the individual risks for all individuals within the subject 
population. That task requires more information than is usually available. Questions addressed by 
descriptors of population risk for reproductive effects would include: What portion of the population is 
within a specified range of some reference level, e.g., exceeds the RfD (a dose), the RfC (a 
concentration), or other health concern level? 

For reproductive effects, risk assessment techniques have not been developed generally to the 
point of knowing how to add risk probabilities, although Hattis and Silver (1994) have proposed 
approaches for certain case-specific situations. Therefore, the following descriptor is usually 
appropriate: An estimate of the percentage of the population, or the number of persons, above a 
specified level of risk or within a specified range of some reference level (e.g., exceeds the RfD, RfC, 
LOAEL, or other specific level of interest). The RfD or RfC is assumed to be a level below which no 
significant risk occurs. Therefore, information from the exposure assessment on the populations below 
the RfD or RfC ("not likely to be at risk") and above the RfD or RfC ("may be at risk") may be useful 
information for risk managers. Estimating the number of persons potentially removed from the "may be 
at risk" category after a contemplated action is taken may be particularly useful to a risk manager 
considering possible actions to ameliorate risk for a population. This descriptor must be obtained 
through measuring or simulating the population distribution. 
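
The population descriptor described above can be sketched as a simple count against a reference level. The exposure values, the RfD, and the assumed effect of the contemplated action (halving every exposure) are hypothetical and serve only to show the arithmetic; an actual assessment would rely on a measured or simulated population distribution.

```python
def population_above_reference(exposures, reference_level):
    """Return (count, percent) of individuals whose exposure exceeds the reference level."""
    above = sum(1 for x in exposures if x > reference_level)
    percent = 100.0 * above / len(exposures)
    return above, percent


# Hypothetical daily doses (mg/kg-day) and a hypothetical RfD.
doses = [0.1, 0.2, 0.2, 0.3, 0.3, 0.4, 0.5, 0.6, 0.9, 2.0]
rfd = 0.5

before = population_above_reference(doses, rfd)

# Hypothetical contemplated action that halves every individual's exposure.
after = population_above_reference([x * 0.5 for x in doses], rfd)

# Persons potentially removed from the "may be at risk" category.
removed = before[0] - after[0]
```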

6.3.3. Margin of Exposure 

In the risk characterization, dose-response information and the human exposure estimates may 
be combined either by comparing the RfD or RfC and the human exposure estimate or by calculating 
the margin of exposure (MOE). The MOE is the ratio of the NOAEL or benchmark dose from the 
most appropriate or sensitive species to the estimated human exposure level from all potential sources 
(U.S. EPA, 1985a). If a NOAEL is not available, a LOAEL may be used in the calculation of the 
MOE, but consideration for the acceptability would be different than when a NOAEL is used. 
Considerations for the acceptability of the MOE are similar to those for the selection of uncertainty 
factors applied to the NOAEL, LOAEL, or the benchmark dose for the derivation of an RfD. The 
MOE is presented along with the characterization of the database, including the strengths and 
weaknesses of the toxicity and exposure data, the number of species affected, and the information on 
dose-response, route, timing, and duration. The RfD or RfC comparison with the human exposure 
estimate and the calculation of the MOE are conceptually similar, but may be used in different 
regulatory situations. 
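
The MOE ratio defined above can be written out as a short sketch. The NOAEL and exposure values and the function name are hypothetical; whether a given MOE is acceptable is a judgment that depends on the database and on considerations similar to those used in selecting uncertainty factors, and is not decided by the calculation itself.

```python
def margin_of_exposure(point_of_departure, human_exposure):
    """MOE = NOAEL (or benchmark dose, or LOAEL) divided by estimated human exposure."""
    if human_exposure <= 0:
        raise ValueError("human exposure estimate must be positive")
    return point_of_departure / human_exposure


# Hypothetical values, both in mg/kg-day.
noael = 50.0            # from the most appropriate or sensitive species
human_exposure = 0.25   # estimated exposure from all potential sources

moe = margin_of_exposure(noael, human_exposure)
```

If a LOAEL is substituted for the NOAEL, the same ratio is computed, but a larger MOE would generally be expected before the result is considered acceptable.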

The choice of approach is dependent on several factors, including the statute involved, the 
situation being addressed, the database used, and the needs of the decisionmaker. The RfD, RfC, or 
MOE are considered along with other risk assessment and risk management issues in making risk 
management decisions, but the scientific issues that should be taken into account in establishing them 
have been addressed here. 

6.3.4. Distribution of Exposure and Risk for Different Subgroups 

A risk manager might also ask questions about the distribution of the risk burden among various 
segments of the subject population such as the following: How do exposure and reproductive risk 
impact various subgroups? and What is the population risk of a particular subgroup? Questions about 
the distribution of exposure and reproductive risk among such population segments require additional 
risk descriptors. 

6.3.4.1. Highly Exposed 

The purpose of this measure is to describe the upper end of the exposure distribution, allowing 
risk managers to evaluate whether certain individuals are at disproportionately high or unacceptably high 
risk. The objective is to look at the upper end of the exposure distribution to derive a realistic estimate 
of relatively highly exposed individual(s). The "high end" of the risk distribution has been defined 
(Habicht, 1992; Browner, 1995) as above the 90th percentile of the actual (either measured or 
estimated) distribution. Whenever possible, it is important to express the number or proportion of 
individuals who comprise the selected highly exposed group and, if data are available, discuss the 
potential for exposure at still higher levels. 

Highly exposed subgroups can be identified and, where possible, characterized, and the 
magnitude of risk quantified. This descriptor is useful when there is (or is expected to be) a subgroup 
experiencing significantly different exposures or doses from those of the larger population. These 
subpopulations may be identified by age, sex, lifestyle, economic factors, or other demographic 
variables. For example, toddlers who play in contaminated soil and consumers of large amounts of fish 
represent subpopulations that may have greater exposures to certain agents. 

If population data are absent, it will often be possible to describe a scenario representing high- 
end exposures using upper percentile or judgment-based values for exposure variables. In these 
instances, caution should be taken not to overestimate the high-end values if a "reasonable" exposure 
estimate is to be achieved. 

6.3.4.2. Highly Susceptible 

Highly susceptible subgroups also can be identified and, if possible, characterized, and the 
magnitude of risk quantified. This descriptor is useful when the sensitivity or susceptibility to the effect 
for specific subgroups is (or is expected to be) significantly different from that of the larger population. 
Therefore, the purpose of this measure is to quantify exposure of identified sensitive or susceptible 
populations to the agent of concern. Sensitive or susceptible individuals are those within the exposed 
population at increased risk of expressing the adverse effect. Examples might be pregnant or lactating 
women, women with reduced oocyte numbers, men with "borderline" sperm counts, or infants. To 
calculate risk for these subgroups, it will be necessary sometimes to use a different dose-response 
relationship; e.g., upon exposure to a chemical, pregnant or lactating women, elderly people, children of 
varying ages, and people with certain illnesses may each be more sensitive than the population as a 
whole. 

In general, not enough is understood about the mechanisms of toxicity to identify sensitive 
subgroups for most agents, although factors such as age, nutrition, personal habits (e.g., smoking, 
consumption of alcohol, and abuse of drugs), existing disease (e.g., diabetes or sexually transmitted 
diseases), or genetic polymorphisms may predispose some individuals to be more sensitive to the 
reproductive effects of various agents. 

It is important to consider, however, that the Agency's current methods for developing 
reference doses and reference concentrations (RfDs and RfCs) are designed to protect sensitive 
populations. If data on sensitive human populations are available (and there is confidence in the quality 
of the data), then the RfD is based on the dose level at which no adverse effects are observed in the 
sensitive population. If no such data are available (for example, if the RfD is developed using data from 
humans of average or unknown sensitivity), then an additional 3- to 10-fold factor may be used to 
account for variability between the average human response and the response of more sensitive 
individuals (see Section 4). 
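
As a rough illustration of the adjustment described above, the following sketch divides a no-observed-adverse-effect level (NOAEL) by the 3- to 10-fold factor for variability among humans. The function name, the NOAEL of 10 mg/kg-day, and the use of a single factor are assumptions made for this example only; an actual RfD derivation may apply other uncertainty factors as well (see Section 4).

```python
def reference_dose(noael_mg_per_kg_day: float, intraspecies_factor: float) -> float:
    """Divide a NOAEL by a factor covering average-to-sensitive human variability."""
    if not 3.0 <= intraspecies_factor <= 10.0:
        raise ValueError("the text describes a 3- to 10-fold factor")
    return noael_mg_per_kg_day / intraspecies_factor


# Hypothetical NOAEL from humans of average or unknown sensitivity.
noael = 10.0  # mg/kg-day, illustrative value only
rfd_low_factor = reference_dose(noael, 3.0)    # least protective end of the range
rfd_high_factor = reference_dose(noael, 10.0)  # most protective end of the range
```

The two results bracket the range of values that the choice of factor alone can produce for the same data.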

Generally, selection of the population segments to consider for high susceptibility is a matter of 
either a priori interest in the subgroup (e.g., environmental justice considerations), in which case the 
risk assessor and risk manager can jointly agree on which subgroups to highlight, or a matter of 
discovery of a sensitive or highly exposed subgroup during the assessment process. In either case, 
once identified, the subgroup can be treated as a population in itself and characterized in the same way 
as the larger population using the descriptors for population and individual risk. 

6.3.5. Situation-Specific Information 

Presenting situation-specific scenarios for important exposure situations and subpopulations in 
the form of "what if?" questions may be particularly useful to give perspective to risk managers on 
possible future events. The question being asked in these cases is, for any given exposure level, what 
would be the resulting number or proportion of individuals who may be exposed to levels above that 
value? 

"What if ...?" questions, such as those that follow, can be used to examine candidate risk 
management options: 

What are the reproductive risks if a pesticide applicator applies this pesticide without using 
protective equipment? 

What are the reproductive risks if this site becomes residential in the future? 

What are the reproductive risks if we set the standard at 100 ppb? 

Answering such "what if?" questions involves a calculation of risk based on specific 
combinations of factors postulated within the assessment. The answers to these "what if?" questions do 
not, by themselves, give information about how likely the combination of values might be in the actual 
population or about how many (if any) persons might be subjected to the potential future reproductive 
risk. However, information on the likelihood of the postulated scenario would be desirable to include in 
the assessment. 
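
The question posed by a "what if?" scenario (how many individuals, or what proportion, would be exposed above a given level) can be sketched as a simple count over scenario exposure estimates. The exposure values and the function name below are hypothetical, and the 100 ppb level is borrowed from the example question only for illustration. As the text notes, the result says nothing about how likely the scenario itself is.

```python
def proportion_above(exposures, level):
    """Return (count, proportion) of individuals whose exposure exceeds level."""
    if not exposures:
        raise ValueError("need at least one exposure estimate")
    count = sum(1 for x in exposures if x > level)
    return count, count / len(exposures)


# Hypothetical exposure estimates (ppb) for ten individuals under one scenario.
scenario_exposures = [12.0, 45.0, 80.0, 101.0, 150.0, 95.0, 30.0, 220.0, 60.0, 99.0]
count, fraction = proportion_above(scenario_exposures, 100.0)
```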

When addressing projected changes for a population (either expected future developments or 
consideration of different regulatory options), it usually is appropriate to calculate and consider all the 
reproductive risk descriptors discussed above. When central tendency or high-end estimates are 
developed for a scenario, these descriptors should reflect reasonable expectations about future 
activities. For example, in site-specific risk assessments, future scenarios should be evaluated when 
they are supported by realistic forecasts of future land use, and the reproductive risk descriptors should 
be developed within that context. 

6.3.6. Evaluation of the Uncertainty in the Risk Descriptors 

Reproductive risk descriptors are intended to address variability of risk within the population 
and the overall adverse impact on the population. In particular, differences between high-end and 
central tendency estimates reflect variability in the population but not the scientific uncertainty inherent in 
the risk estimates. As discussed above there will be uncertainty in all estimates of reproductive risk. 
These uncertainties can include measurement uncertainties, modeling uncertainties, and assumptions to 
fill data gaps. Risk assessors should address the impact of each of these factors on the confidence in 
the estimated reproductive risk values. 

Both qualitative and quantitative evaluations of uncertainty provide useful information to users of 
the assessment. The techniques of quantitative uncertainty analysis are evolving rapidly and both the 
SAB (Loehr and Matanoski, 1993) and the NRC (1994) have urged the Agency to incorporate these 
techniques into its risk analyses. However, it should be noted that a probabilistic assessment that uses 
only the assessor's best estimates for distributions of population variables addresses variability, but not 
uncertainty. Uncertainties in the estimated risk distribution need to be evaluated separately. An 
approach has been proposed for estimating distribution of uncertainty in noncancer risk assessments 
(Baird et al., 1996). 
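
The distinction between variability and uncertainty can be illustrated with a generic two-stage simulation. This is not the approach of Baird et al.; it is a minimal sketch in which every distribution, parameter value, and name is an assumption. The inner simulation describes variability across people for one best-estimate distribution. The outer loop resamples an uncertain parameter of that distribution to show how far the high-end estimate can move.

```python
import random


def high_end_exposure(mean_log, sd_log, n_people, rng):
    """95th percentile of a simulated population (variability only)."""
    doses = sorted(rng.lognormvariate(mean_log, sd_log) for _ in range(n_people))
    return doses[int(0.95 * n_people)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Single run with the assessor's best-estimate distribution:
# this captures variability across people, not uncertainty.
best_estimate = high_end_exposure(0.0, 0.5, 1000, rng)

# Outer loop: treat the distribution's own parameter as uncertain and resample it.
outer = []
for _ in range(200):
    uncertain_mean_log = rng.gauss(0.0, 0.3)  # hypothetical parameter uncertainty
    outer.append(high_end_exposure(uncertain_mean_log, 0.5, 1000, rng))

outer.sort()
# Rough 90% interval on the high-end estimate, reflecting uncertainty.
uncertainty_interval = (outer[int(0.05 * len(outer))], outer[int(0.95 * len(outer))])
```

Reporting only `best_estimate` would hide the spread captured by `uncertainty_interval`, which is why the text calls for uncertainty to be evaluated separately.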

6.4. SUMMARY AND RESEARCH NEEDS 

These Guidelines summarize the procedures that the EPA will follow in evaluating the potential 
for agents to cause reproductive toxicity. They discuss the assumptions that must be made in risk 
assessment for reproductive toxicity because of gaps in our knowledge about underlying biologic 
processes and how these compare across species. Research to improve the interpretation of data and 
interspecies extrapolation is needed. This research includes studies that: (1) more completely 
characterize and define female and male reproductive endpoints, (2) more completely characterize the 
types of developmental toxicity possible, (3) evaluate the interrelationships among endpoints, (4) 
examine quantitative extrapolation between endpoints (e.g., sperm count) and function (e.g., fertility), 
(5) provide a better understanding of the relationships between reproductive toxicity and other forms of 
toxicity, (6) explore pharmacokinetic disposition of the target, and (7) examine mechanistic phenomena 
related to pharmacokinetic disposition. These types of studies, along with further evaluation of a 
nonlinear dose-response for susceptible populations, should provide methods to more precisely assess 
risk. 

PART B: RESPONSE TO SCIENCE ADVISORY BOARD AND PUBLIC COMMENTS 



1. INTRODUCTION 

A notice of availability for public comment of these Guidelines was published in the Federal 
Register (FR) in February 1994. Seven responses were received. These Guidelines were presented 
to the Environmental Health Committee of the Science Advisory Board (SAB) on July 19, 1994. The 
report of the SAB was provided to the Agency in May 1995, with further communication from the 
SAB Executive Committee provided in December 1995. 

The SAB and public comments were diverse and represented varying perspectives. Many of 
the comments were favorable and expressed agreement with positions taken in the proposed guidelines. 
A number of the comments addressed items that were more pertinent to testing guidance than risk 
assessment guidance or were otherwise beyond the scope of these Guidelines. Some of those were 
generic issues that are not system specific. Others were topics that have not been developed 
sufficiently and should be viewed as research issues. There were conflicting views about the need to 
provide additional detailed guidance about decision-making in the evaluation process as opposed to 
promoting extensive use of scientific judgment. Also, comments provided specific suggestions for 
clarification of details. 

2. RESPONSE TO SCIENCE ADVISORY BOARD COMMENTS 

In general, the SAB found "the overall scientific foundations of the draft guidelines' positions to 
be generally sound." However, recommendations were made to improve specific areas. 

The SAB recommended that EPA retain separate sections for identification and dose-response 
assessment in the draft guidelines. In subsequent meetings involving the SAB Executive Committee, 
members of the Clean Air Scientific Advisory Committee, and the Environmental Health Committee, 
this issue was explored further. After discussion, the SAB agreed with expanding the hazard 
identification to include certain components of the dose-response assessment. The resulting hazard 
characterization provides an evaluation of hazard within the context of the dose, route, timing, and 
duration of exposure. The next step, the dose-response analysis, quantitatively evaluates the 
relationship between dose or exposure and severity or probability of effect in humans. EPA has revised 
these Guidelines to reflect that position, which is also consistent with the 1994 NRC report, Science 
and Judgment in Risk Assessment. The SAB suggested an alternative scheme for characterizing 
health effects data in Table 5. The Agency's intent for Table 5 is not to characterize the available data, 
but rather to judge whether the database is sufficient to proceed further in the risk assessment process. 
The text has been modified to clarify the intended use of this table and to ensure that it is consistent with 
the reorganization of the Guidelines into separate hazard characterization and quantitative 
dose-response analysis sections. 

The SAB supported the concept of using a gender neutral default assumption, but indicated that 
more discussion to support this assumption was needed. In particular, the Committee indicated that a 
fuller discussion is needed on "information to the contrary" (to obviate the need for making this default 
assumption), as well as additional guidance for using this and other default assumptions in risk 
characterization. The Agency agrees with this recommendation and provides further guidance on the 
use of the gender neutral default assumption. In keeping with recent Agency guidance on risk 
characterization, discussion on the use of default assumptions has been expanded in the risk 
characterization section of these Guidelines. 

The SAB in its reviews of the reproductive toxicity and neurotoxicity risk assessment guidelines 
discussed assumptions about the behavior of the dose-response curve. The SAB's advice has been 
that the Agency examine available data first, and only use nonlinear behavior as a default if available 
data do not define the dose-response curve. The SAB also recommended that the benchmark dose 
method be considered as a possible alternative to the NOAEL/LOAEL approach. The Agency 
agrees. 

The SAB recommended that more discussion be devoted to the issue of disruption of endocrine 
systems by environmental agents. The section on Endocrine Evaluations has been expanded to include 
endocrine disruption of the reproductive system during development in addition to effects on adults. 

The SAB supported the principle in the Guidelines that more than one negative study is 
necessary to judge that a chemical is unlikely to pose a reproductive hazard. That principle has been 
retained and, as recommended by the SAB, an explicit statement included that data from a second 
species are necessary to determine that sufficient information is available to indicate that an agent is 
unlikely to pose a hazard. 

The SAB recommended that the topic of susceptible populations be expanded and that the 
Guidelines should indicate that relevant information be incorporated into risk assessments when 
possible. To address this issue, the Agency has emphasized potential differences in risks in children at 
different stages of development, females (including pregnant and lactating females), and males, and 
indicated that relevant information on differential risks for susceptible populations should be included in 
the risk characterization section when available. When specific information on differential risks is not 
available, the Agency will continue to apply a default uncertainty factor to account for potential 
differences in susceptibility. 

The SAB recommended that the Agency provide more specific guidance for exposure 
assessment issues that arise when characterizing exposure for reproductive toxicants. The Agency 
agrees and has indicated that an exposure assessment: include a statement of purpose, scope, level of 
detail, and approach used; present the estimate of exposure and dose by pathway and route for 
individuals, population segments, and populations in a manner appropriate for the intended risk 
characterization; and provide an evaluation of the overall level of confidence (including consideration of 
uncertainty factors) in the estimate of exposure and dose and the conclusions drawn. The SAB 
recommended that the MOE discussion be modified to address specific circumstances where the 
administered dose and the "effective dose" are known to be different. The discussion has been 
modified to emphasize that pharmacokinetic data, when available, be utilized to address such instances. 

The SAB recommended that the Agency expand substantially the discussion of overall strategy 
to evaluate exposure from mixtures, exposures to multiple single agents, and exposures to the same 
agent via different routes. It is anticipated that this type of information will be addressed in the 
Agency's upcoming revisions to the chemical mixture guidelines. 

3. RESPONSE TO PUBLIC COMMENTS 

In addition to numerous supportive statements, several issues were indicated although each 
issue was raised by a very limited number of submissions. Use of the benchmark dose was supported 
along with the suggestion that the amount of text could be reduced on that subject. The text has been 
reduced and reference made to the report, The Use of the Benchmark Dose Approach in Health 
Risk Assessment (U.S. EPA, 1995b). A request was made for increased emphasis on paternally 
mediated effects on offspring. The text in that section has been expanded to provide additional 
discussion and references. Concern was expressed about the existence of constraints on the use of 
professional judgment in the risk assessment process, particularly in determining the relevance and 
sufficiency of the database, in evaluating biological plausibility of statistically different effects, and in the 
determination of uncertainty factors. Requests also have been made to provide additional criteria for 
when and under what conditions the risk assessment process will be used. These Guidelines emphasize 
the importance of using scientific judgment throughout the risk assessment process. They provide 
flexibility to permit EPA's offices and regions to develop specific guidance suited to their particular 
needs. The comment was made that the exposure assessment and risk characterization sections were 
not developed as well as the rest of the document. In 1992, EPA published Guidelines for Exposure 
Assessment (U.S. EPA, 1992) that were intended to apply generically to noncancer risk assessments. 
These Guidelines only address aspects of exposure that are specific to reproduction and have been 
developed sufficiently. The risk characterization section has been expanded substantially to reflect the 
recent guidance provided within EPA for application in all risk assessments. 

EXHIBIT D 

IVOS Sperm Analyzer 

The IVOS is an Integrated Visual Optical System for sperm analysis. Since its development, the name 
IVOS has been synonymous with fast, accurate, reliable and scientifically tested sperm analysis. 



Don't forget to check out our customer profiles page to see how 
others are using our sperm analyzers in their lab! 



Features 

• NEW: Built-in CD Writer! 

• Integrated Optics 

• Automated, Internal Specimen Stage 

• Multispecies Compatibility 

• Comparison with CEROS 

• Ease of Use 

• Results You Need 

• Real-Time Quality Control 

• Rated Moderately Complex by CLIA 

• Summary of Results Provided 



Integrated Optics 

(Image: IVOS with Black Trim) 

As the only sperm analysis system with an internal optical system, the IVOS 
offers distinct advantages to the sperm analysis laboratory. In contrast to the 
continuous illumination of a microscope, the IVOS uses illumination strobed 
at 1/1000 of a second to visualize sperm motion. This strobed illumination 
eliminates motion-related blurring along the length of the sperm head, 
resulting in precise sperm tracking. By adding an image capture rate of 60 
frames per second, you get the highest level of accuracy available today for 
measuring sperm velocities and motion parameters. For added flexibility, the 
IVOS performs analyses under three types of internal, strobed illumination: 
phase contrast, bright field and multi-wavelength fluorescence. 
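
The effect of strobing on motion blur can be checked with simple arithmetic: the distance a sperm head travels while illuminated is its speed multiplied by the illumination time. In the sketch below, the 1/1000 second strobe and the 60 frames per second rate come from the text. The 150 microns/sec speed is hypothetical, and treating continuous illumination as lasting the full frame interval is an assumption.

```python
STROBE_SECONDS = 1 / 1000  # strobe duration stated on the product page
FRAME_RATE_HZ = 60         # image capture rate stated on the product page


def travel_microns(speed_um_per_s: float, seconds: float) -> float:
    """Distance a sperm head moves during one illumination interval."""
    return speed_um_per_s * seconds


speed = 150.0  # hypothetical track velocity, microns/sec
blur_strobed = travel_microns(speed, STROBE_SECONDS)        # about 0.15 microns
blur_full_frame = travel_microns(speed, 1 / FRAME_RATE_HZ)  # about 2.5 microns
```

Under these assumptions the strobed exposure smears the image over a distance roughly seventeen times shorter than a full-frame exposure would.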

Automated, Internal, Heated Specimen Stage 



As part of the integrated optical system, the unique computer-controlled 
specimen stage of the IVOS provides precise control of temperature and 
position during analysis. The stage temperature, which may be set from 
ambient to 40°C, remains constant to within 0.5°C. For selection of analysis 
fields, the stage may be programmed for either manual or automatic field 
selection. For at-a-glance monitoring, the current stage temperature and 
position are continuously shown on screen as a real-time, digital display. 
The IVOS stage accommodates and automatically adjusts for commonly used 
analysis chambers, such as disposable fixed-depth slides, cannulas and Makler 
chambers. In addition, a user-defined chamber setting is available. 

Multispecies Compatibility 

Depending on your need, the IVOS comes with standard Clinical (human) software, TOX software, Swine 
software, Equine Breeders software, Animal Breeders software or Animal Motility software. 

The below video shows canine sperm recorded on the IVOS, under 10x. 



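Comparison with CEROS 

So which system is right for you? 

IVOS                                       | CEROS
-------------------------------------------+------------------------------------------
If high-end performance using the          | If you require the objectivity and
published standard is key for your lab,    | accuracy of automated analysis, but at
then the IVOS is the best option.          | a more affordable price, then choose
                                           | the CEROS.
-------------------------------------------+------------------------------------------
All optics components built into one       | External microscope allows computer
integrated workstation                     | to be stored under the lab bench to
                                           | save space
-------------------------------------------+------------------------------------------
Strobe illumination: provides sharpest     | Familiar, standard microscope
imaging                                    | illumination
-------------------------------------------+------------------------------------------
Automated, built-in, heated stage for      | Portable MiniTherm Stage Warmer
precise temperature control and            | maintains samples at 37°C
sample positioning                         |
-------------------------------------------+------------------------------------------
Preselection of fields for fastest         | X-Y stage movement increases
analysis                                   | number of fields available for motility
                                           | and morphology
-------------------------------------------+------------------------------------------
Required for optional IDENT                | Not compatible with IDENT
fluorescence capability                    | fluorescence capability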

Ease of Use 

The IVOS hardware and software have been designed to provide users with a system that offers high-end 
performance while maintaining its ease of use. The intuitive Windows-based software promotes fast learning and 
quickly increases the confidence level of users. Since the settings for the integrated optics are made through the 
software, mechanical adjustments are rarely required. With a minimum investment of time, even users unfamiliar 
with computers will be performing fast, accurate and reliable sperm analyses. You don't need to be a computer 
expert to analyze sperm with the IVOS. 

• Redesigned software interface of Version 12 makes operation intuitive 

• Up to 7 analysis setups (including optic calibrations) may be stored to memory and selected at the touch of 
a button 

• Patient and sample information easily entered from keyboard 

• Simply place the sample chamber on the stage, focus the image, select the fields, and begin analysis 

• On-site customized installation and training provided with all new systems 

Results You Need 

Whether you need basic counts and motilities or detailed analysis of sperm motion, the IVOS provides you with 
the results you need. Results for motile, progressively motile and static sperm include actual number counted, 
sample total, concentrations and percentages. Additional results include averages and distribution bar charts of 
velocities, motion characteristic and morphometry. 



COUNT SUMMARY 

Category      Cells Counted   Sample (M)   Concentration (M/ml)   Percent
Total         540             121.1        30.3                   100
Motile        376             84.4         21.1                   70
Progressive   136             30.5         7.6                    25


Samples of all results screens are available under Sample Screens. 
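
The Percent column of the count summary appears to be each category's cell count divided by the total count and rounded to a whole number. That reading is an assumption about how the analyzer reports the figure; the sketch below only checks it against the counts shown in the table.

```python
# Cells counted per category, taken from the count summary table.
cells_counted = {"Total": 540, "Motile": 376, "Progressive": 136}


def percent_of_total(category: str) -> int:
    """Percent of all counted cells that fall in the category, rounded."""
    return round(100 * cells_counted[category] / cells_counted["Total"])


percent_motile = percent_of_total("Motile")            # matches the 70 in the table
percent_progressive = percent_of_total("Progressive")  # matches the 25 in the table
```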

Real-Time Quality Control 

The IVOS provides on-demand quality control with the Playback feature. Using 
image Playback, the accuracy of the analysis is confirmed. To optimize the 
system, analysis parameters may be adjusted using the two interactive QC Plots 
or directly from the analysis setup screen. 

Rated Moderately Complex by CLIA 

Because of its ease of use and high levels of quality control, the IVOS has been awarded a rating of moderately 
complex by CLIA. This means fewer restrictions for your lab! 



Summary of Results Provided 

Counts: 

Total, Motile, Progressive 
% Motile, % Progressively Motile 
Rapid, Medium, Slow and Static Cells 

Concentrations: 

Total, Motile, Progressive (millions/ml) 
Rapid, Medium, Slow and Static Cells (millions/ml) 

Mean Values: 

VAP: Smoothed Path Velocity (microns/sec) 
VCL: Track Velocity (microns/sec) 
VSL: Straight Line Velocity (microns/sec) 
ALH: Amplitude of Lateral Head Displacement (microns) 
BCF: Beat Cross Frequency (hertz) 
LIN: Linearity (ratio of VSL/VCL) 
STR: Straightness (ratio of VSL/VAP) 
Elongation: head shape (ratio of minor to major axis of sperm head) 
Area: head size (square microns) 

Bar Chart Distributions: 

VAP, VCL, VSL, Elongation, ALH, BCF, LIN, STR 
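
The velocity ratios listed above can be illustrated on a single track of head positions. The LIN and STR definitions (VSL/VCL and VSL/VAP) come from the list. The three-point running average used for the smoothed path, the function names, and the zigzag track are assumptions for this sketch and are not the analyzer's actual algorithm.

```python
import math


def path_length(points):
    """Sum of straight segments between consecutive (x, y) points, in microns."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def smooth(points):
    """Three-point running average; endpoints are kept fixed (assumed method)."""
    out = [points[0]]
    for i in range(1, len(points) - 1):
        chunk = points[i - 1:i + 2]
        out.append((sum(p[0] for p in chunk) / 3, sum(p[1] for p in chunk) / 3))
    out.append(points[-1])
    return out


def kinematics(track, frame_rate_hz=60.0):
    """Return VCL, VAP, VSL (microns/sec) and the LIN and STR ratios."""
    duration = (len(track) - 1) / frame_rate_hz
    vcl = path_length(track) / duration            # along the actual track
    vap = path_length(smooth(track)) / duration    # along the smoothed path
    vsl = math.dist(track[0], track[-1]) / duration  # first to last point
    return {"VCL": vcl, "VAP": vap, "VSL": vsl, "LIN": vsl / vcl, "STR": vsl / vap}


# Hypothetical zigzag head positions (microns), one per frame at 60 frames/sec.
track = [(0, 0), (1, 1), (2, 0), (3, 1), (4, 0), (5, 1), (6, 0)]
result = kinematics(track)
```

For this track the straight-line velocity is lowest and the track velocity highest, so both ratios fall between 0 and 1, with STR closer to 1 than LIN because smoothing removes most of the side-to-side motion.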